linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-08-13	btrfs: only run the extent map shrinker from kswapd tasks	Filipe Manana
	Currently the extent map shrinker can be run by any task when attempting to allocate memory and there's enough memory pressure to trigger it. To avoid too much latency we stop iterating over extent maps and removing them once the task needs to reschedule. This logic was introduced in commit b3ebb9b7e92a ("btrfs: stop extent map shrinker if reschedule is needed"). While that solved high latency problems for some use cases, it's still not enough because with a too high number of tasks entering the extent map shrinker code, either due to memory allocations or because they are a kswapd task, we end up having a very high level of contention on some spin locks, namely: 1) The fs_info->fs_roots_radix_lock spin lock, which we need to find roots to iterate over their inodes; 2) The spin lock of the xarray used to track open inodes for a root (struct btrfs_root::inodes) - on 6.10 kernels and below, it used to be a red black tree and the spin lock was root->inode_lock; 3) The fs_info->delayed_iput_lock spin lock since the shrinker adds delayed iputs (calls btrfs_add_delayed_iput()). Instead of allowing the extent map shrinker to be run by any task, make it run only by kswapd tasks. This still solves the problem of running into OOM situations due to an unbounded extent map creation, which is simple to trigger by direct IO writes, as described in the changelog of commit 956a17d9d050 ("btrfs: add a shrinker for extent maps"), and by a similar case when doing buffered IO on files with a very large number of holes (keeping the file open and creating many holes, whose extent maps are only released when the file is closed). Reported-by: kzd <kzd@56709.net> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219121 Reported-by: Octavia Togami <octavia.togami@gmail.com> Link: https://lore.kernel.org/linux-btrfs/CAHPNGSSt-a4ZZWrtJdVyYnJFscFjP9S7rMcvEMaNSpR556DdLA@mail.gmail.com/ Fixes: 956a17d9d050 ("btrfs: add a shrinker for extent maps") CC: stable@vger.kernel.org # 6.10+ Tested-by: kzd <kzd@56709.net> Tested-by: Octavia Togami <octavia.togami@gmail.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-08-13	ALSA: hda: cs35l41: Remove redundant call to hda_cs_dsp_control_remove()	Richard Fitzgerald
	The driver doesn't create any ALSA controls for firmware controls, so it shouldn't be calling hda_cs_dsp_control_remove(). commit 312c04cee408 ("ALSA: hda: cs35l41: Stop creating ALSA Controls for firmware coefficients") removed the call to hda_cs_dsp_add_controls() but didn't remove the call for destroying those controls. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Fixes: 312c04cee408 ("ALSA: hda: cs35l41: Stop creating ALSA Controls for firmware coefficients") Link: https://patch.msgid.link/20240813113209.648-1-rf@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-08-13	btrfs: tree-checker: reject BTRFS_FT_UNKNOWN dir type	Qu Wenruo
	[REPORT] There is a bug report that kernel is rejecting a mismatching inode mode and its dir item: [ 1881.553937] BTRFS critical (device dm-0): inode mode mismatch with dir: inode mode=040700 btrfs type=2 dir type=0 [CAUSE] It looks like the inode mode is correct, while the dir item type 0 is BTRFS_FT_UNKNOWN, which should not be generated by btrfs at all. This may be caused by a memory bit flip. [ENHANCEMENT] Although tree-checker is not able to do any cross-leaf verification, for this particular case we can at least reject any dir type with BTRFS_FT_UNKNOWN. So here we enhance the dir type check from [0, BTRFS_FT_MAX), to (0, BTRFS_FT_MAX). Although the existing corruption can not be fixed just by such enhanced checking, it should prevent the same 0x2->0x0 bitflip for dir type to reach disk in the future. Reported-by: Kota <nospam@kota.moe> Link: https://lore.kernel.org/linux-btrfs/CACsxjPYnQF9ZF-0OhH16dAx50=BXXOcP74MxBc3BG+xae4vTTw@mail.gmail.com/ CC: stable@vger.kernel.org # 5.4+ Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-08-13	btrfs: check delayed refs when we're checking if a ref exists	Josef Bacik
	In the patch 78c52d9eb6b7 ("btrfs: check for refs on snapshot delete resume") I added some code to handle file systems that had been corrupted by a bug that incorrectly skipped updating the drop progress key while dropping a snapshot. This code would check to see if we had already deleted our reference for a child block, and skip the deletion if we had already. Unfortunately there is a bug, as the check would only check the on-disk references. I made an incorrect assumption that blocks in an already deleted snapshot that was having the deletion resume on mount wouldn't be modified. If we have 2 pending deleted snapshots that share blocks, we can easily modify the rules for a block. Take the following example subvolume a exists, and subvolume b is a snapshot of subvolume a. They share references to block 1. Block 1 will have 2 full references, one for subvolume a and one for subvolume b, and it belongs to subvolume a (btrfs_header_owner(block 1) == subvolume a). When deleting subvolume a, we will drop our full reference for block 1, and because we are the owner we will drop our full reference for all of block 1's children, convert block 1 to FULL BACKREF, and add a shared reference to all of block 1's children. Then we will start the snapshot deletion of subvolume b. We look up the extent info for block 1, which checks delayed refs and tells us that FULL BACKREF is set, so sets parent to the bytenr of block 1. However because this is a resumed snapshot deletion, we call into check_ref_exists(). Because check_ref_exists() only looks at the disk, it doesn't find the shared backref for the child of block 1, and thus returns 0 and we skip deleting the reference for the child of block 1 and continue. This orphans the child of block 1. The fix is to lookup the delayed refs, similar to what we do in btrfs_lookup_extent_info(). However we only care about whether the reference exists or not. If we fail to find our reference on disk, go look up the bytenr in the delayed refs, and if it exists look for an existing ref in the delayed ref head. If that exists then we know we can delete the reference safely and carry on. If it doesn't exist we know we have to skip over this block. This bug has existed since I introduced this fix, however requires having multiple deleted snapshots pending when we unmount. We noticed this in production because our shutdown path stops the container on the system, which deletes a bunch of subvolumes, and then reboots the box. This gives us plenty of opportunities to hit this issue. Looking at the history we've seen this occasionally in production, but we had a big spike recently thanks to faster machines getting jobs with multiple subvolumes in the job. Chris Mason wrote a reproducer which does the following mount /dev/nvme4n1 /btrfs btrfs subvol create /btrfs/s1 simoop -E -f 4k -n 200000 -z /btrfs/s1 while(true) ; do btrfs subvol snap /btrfs/s1 /btrfs/s2 simoop -f 4k -n 200000 -r 10 -z /btrfs/s2 btrfs subvol snap /btrfs/s2 /btrfs/s3 btrfs balance start -dusage=80 /btrfs btrfs subvol del /btrfs/s2 /btrfs/s3 umount /btrfs btrfsck /dev/nvme4n1 \|\| exit 1 mount /dev/nvme4n1 /btrfs done On the second loop this would fail consistently, with my patch it has been running for hours and hasn't failed. I also used dm-log-writes to capture the state of the failure so I could debug the problem. Using the existing failure case to test my patch validated that it fixes the problem. Fixes: 78c52d9eb6b7 ("btrfs: check for refs on snapshot delete resume") CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-08-13	ASoC: MAINTAINERS: Drop Banajit Goswami from Qualcomm sound drivers	Krzysztof Kozlowski
	There was no active maintenance from Banajit Goswami - last email is from 2019 - so make obvious that Qualcomm sound drivers are maintained by only one person. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20240730103511.21728-1-krzysztof.kozlowski@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-08-13	ASoC: SOF: amd: Fix for incorrect acp error register offsets	Vijendar Mukunda
	Addition of 'dsp_intr_base' to ACP error register offsets points to wrong register offsets in irq handler. Correct the acp error register offsets. ACP error status register offset and acp error reason register offset got changed from ACP6.0 onwards. Add 'acp_error_stat' and 'acp_sw0_i2s_err_reason' as descriptor fields in sof_amd_acp_desc structure and update the values based on the ACP variant. >From Rembrandt platform onwards, errors related to SW1 Soundwire manager instance/I2S controller connected on P1 power tile is reported with ACP_SW1_I2S_ERROR_REASON register. Add conditional check for the same. Fixes: 96eb81851012 ("ASoC: SOF: amd: add interrupt handling for SoundWire manager devices") Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Link: https://patch.msgid.link/20240813105944.3126903-2-Vijendar.Mukunda@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-08-13	ASoC: SOF: amd: move iram-dram fence register programming sequence	Vijendar Mukunda
	The existing code modifies IRAM and DRAM size after sha dma start for vangogh platform. The problem with this sequence is that it might cause sha dma failure when firmware code binary size is greater than the default IRAM size. To fix this issue, Move the iram-dram fence register sequence prior to sha dma start. Fixes: 094d11768f74 ("ASoC: SOF: amd: Skip IRAM/DRAM size modification for Steam Deck OLED") Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Link: https://patch.msgid.link/20240813105944.3126903-1-Vijendar.Mukunda@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-08-13	ALSA: hda: cs35l56: Remove redundant call to hda_cs_dsp_control_remove()	Richard Fitzgerald
	The driver doesn't create any ALSA controls for firmware controls, so it shouldn't be calling hda_cs_dsp_control_remove(). commit 34e1b1bb7324 ("ALSA: hda: cs35l56: Stop creating ALSA controls for firmware coefficients") removed the call to hda_cs_dsp_add_controls() but didn't remove the call for destroying those controls. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Fixes: 34e1b1bb7324 ("ALSA: hda: cs35l56: Stop creating ALSA controls for firmware coefficients") Link: https://patch.msgid.link/20240813110750.2814-1-rf@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-08-13	net: mana: Fix doorbell out of order violation and avoid unnecessary ↵	Long Li
	doorbell rings After napi_complete_done() is called when NAPI is polling in the current process context, another NAPI may be scheduled and start running in softirq on another CPU and may ring the doorbell before the current CPU does. When combined with unnecessary rings when there is no need to arm the CQ, it triggers error paths in the hardware. This patch fixes this by calling napi_complete_done() after doorbell rings. It limits the number of unnecessary rings when there is no need to arm. MANA hardware specifies that there must be one doorbell ring every 8 CQ wraparounds. This driver guarantees one doorbell ring as soon as the number of consumed CQEs exceeds 4 CQ wraparounds. In practical workloads, the 4 CQ wraparounds proves to be big enough that it rarely exceeds this limit before all the napi weight is consumed. To implement this, add a per-CQ counter cq->work_done_since_doorbell, and make sure the CQ is armed as soon as passing 4 wraparounds of the CQ. Cc: stable@vger.kernel.org Fixes: e1b5683ff62e ("net: mana: Move NAPI from EQ to CQ") Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Long Li <longli@microsoft.com> Link: https://patch.msgid.link/1723219138-29887-1-git-send-email-longli@linuxonhyperv.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-13	KVM: SVM: Fix an error code in sev_gmem_post_populate()	Dan Carpenter
	The copy_from_user() function returns the number of bytes which it was not able to copy. Return -EFAULT instead. Fixes: dee5a47cc7a4 ("KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Message-ID: <20240612115040.2423290-4-dan.carpenter@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-13	Merge tag 'kvm-s390-master-6.11-1' of ↵	Paolo Bonzini
	https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD Fix invalid gisa designation value when gisa is not in use. Panic if (un)share fails to maintain security.
2024-08-13	Merge tag 'kvmarm-fixes-6.11-1' of ↵	Paolo Bonzini
	git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.11, round #1 - Use kvfree() for the kvmalloc'd nested MMUs array - Set of fixes to address warnings in W=1 builds - Make KVM depend on assembler support for ARMv8.4 - Fix for vgic-debug interface for VMs without LPIs - Actually check ID_AA64MMFR3_EL1.S1PIE in get-reg-list selftest - Minor code / comment cleanups for configuring PAuth traps - Take kvm->arch.config_lock to prevent destruction / initialization race for a vCPU's CPUIF which may lead to a UAF
2024-08-13	KVM: SVM: Fix uninitialized variable bug	Dan Carpenter
	If snp_lookup_rmpentry() fails then "assigned" is printed in the error message but it was never initialized. Initialize it to false. Fixes: dee5a47cc7a4 ("KVM: SEV: Add KVM_SEV_SNP_LAUNCH_UPDATE command") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Message-ID: <20240612115040.2423290-3-dan.carpenter@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-08-13	net: hinic: use ethtool_sprintf/puts	Rosen Penev
	Simpler and avoids manual pointer addition. Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://patch.msgid.link/20240809044957.4534-1-rosenp@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-13	Merge tag 'ath-next-20240812' of ↵	Kalle Valo
	git://git.kernel.org/pub/scm/linux/kernel/git/ath/ath ath.git patches for v6.12 This is a fairly light pull request since ath12k is still working on MLO-related changes, and the other drivers are mostly in maintenance mode with a few cleanups and bug fixes. Major changes: ath12k * DebugFS support for transmit DE stats * Make ASPM support hardware-dependent * Align BSS Channel information command and message with firmware ath11k * Use work queue for beacon tx events ath9k * Use devm for gpio_request_one * Use unmanaged PCI functions in ath9k_pci_owl_loader()
2024-08-13	wifi: mwl8k: Use static_assert() to check struct sizes	Gustavo A. R. Silva
	Commit 5c4250092fad ("wifi: mwl8k: Avoid -Wflex-array-member-not-at-end warnings") introduced tagged `struct mwl8k_cmd_pkt_hdr`. We want to ensure that when new members need to be added to the flexible structure, they are always included within this tagged struct. We use `static_assert()` to ensure that the memory layout for both the flexible structure and the tagged struct is the same after any changes. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/ZrVCg51Q9M2fTPaF@cute
2024-08-13	Merge tag 'ath-current-20240812' of ↵	Kalle Valo
	git://git.kernel.org/pub/scm/linux/kernel/git/ath/ath ath.git patch for v6.11 We have a single patch for the next 6.11-rc which introduces a workaround to ath12k which addresses a WCN7850 hardware issue that prevents proper operation with unaligned transmit buffers.
2024-08-13	wifi: iwlwifi: correctly lookup DMA address in SG table	Benjamin Berg
	The code to lookup the scatter gather table entry assumed that it was possible to use sg_virt() in order to lookup the DMA address in a mapped scatter gather table. However, this assumption is incorrect as the DMA mapping code may merge multiple entries into one. In that case, the DMA address space may have e.g. two consecutive pages which is correctly represented by the scatter gather list entry, however the virtual addresses for these two pages may differ and the relationship cannot be resolved anymore. Avoid this problem entirely by working with the offset into the mapped area instead of using virtual addresses. With that we only use the DMA length and DMA address from the scatter gather list entries. The underlying DMA/IOMMU code is therefore free to merge two entries into one even if the virtual addresses space for the area is not continuous. Fixes: 90db50755228 ("wifi: iwlwifi: use already mapped data when TXing an AMSDU") Reported-by: Chris Bainbridge <chris.bainbridge@gmail.com> Closes: https://lore.kernel.org/r/ZrNRoEbdkxkKFMBi@debian.local Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20240812110640.460514-1-benjamin@sipsolutions.net
2024-08-13	wifi: mt76: mt7921: fix NULL pointer access in mt7921_ipv6_addr_change	Bert Karwatzki
	When disabling wifi mt7921_ipv6_addr_change() is called as a notifier. At this point mvif->phy is already NULL so we cannot use it here. Signed-off-by: Bert Karwatzki <spasswolf@web.de> Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20240812104542.80760-1-spasswolf@web.de
2024-08-13	mips: sgi-ip22: Fix the build	Bart Van Assche
	Fix a recently introduced build failure. Fixes: d69d80484598 ("driver core: have match() callback in struct bus_type take a const *") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240805232026.65087-3-bvanassche@acm.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-13	ARM: riscpc: ecard: Fix the build	Bart Van Assche
	Fix a recently introduced build failure. Cc: Russell King <rmk+kernel@armlinux.org.uk> Fixes: d69d80484598 ("driver core: have match() callback in struct bus_type take a const *") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240805232026.65087-2-bvanassche@acm.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-13	tty: atmel_serial: use the correct RTS flag.	Mathieu Othacehe
	In RS485 mode, the RTS pin is driven high by hardware when the transmitter is operating. This behaviour cannot be changed. This means that the driver should claim that it supports SER_RS485_RTS_ON_SEND and not SER_RS485_RTS_AFTER_SEND. Otherwise, when configuring the port with the SER_RS485_RTS_ON_SEND, one get the following warning: kern.warning kernel: atmel_usart_serial atmel_usart_serial.2.auto: ttyS1 (1): invalid RTS setting, using RTS_AFTER_SEND instead which is contradictory with what's really happening. Signed-off-by: Mathieu Othacehe <othacehe@gnu.org> Cc: stable <stable@kernel.org> Tested-by: Alexander Dahl <ada@thorsis.com> Fixes: af47c491e3c7 ("serial: atmel: Fill in rs485_supported") Link: https://lore.kernel.org/r/20240808060637.19886-1-othacehe@gnu.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-13	tty: vt: conmakehash: remove non-portable code printing comment header	Masahiro Yamada
	Commit 6e20753da6bc ("tty: vt: conmakehash: cope with abs_srctree no longer in env") included <linux/limits.h>, which invoked another (wrong) patch that tried to address a build error on macOS. According to the specification [1], the correct header to use PATH_MAX is <limits.h>. The minimal fix would be to replace <linux/limits.h> with <limits.h>. However, the following commits seem questionable to me: - 3bd85c6c97b2 ("tty: vt: conmakehash: Don't mention the full path of the input in output") - 6e20753da6bc ("tty: vt: conmakehash: cope with abs_srctree no longer in env") These commits made too many efforts to cope with a comment header in drivers/tty/vt/consolemap_deftbl.c: /* * Do not edit this file; it was automatically generated by * * conmakehash drivers/tty/vt/cp437.uni > [this file] * / With this commit, the header part of the generate C file will be simplified as follows: / * Automatically generated file; Do not edit. */ BTW, another series of excessive efforts for a comment header can be seen in the following: - 5ef6dc08cfde ("lib/build_OID_registry: don't mention the full path of the script in output") - 2fe29fe94563 ("lib/build_OID_registry: avoid non-destructive substitution for Perl < 5.13.2 compat") [1]: https://pubs.opengroup.org/onlinepubs/009695399/basedefs/limits.h.html Fixes: 6e20753da6bc ("tty: vt: conmakehash: cope with abs_srctree no longer in env") Cc: stable <stable@kernel.org> Reported-by: Daniel Gomez <da.gomez@samsung.com> Closes: https://lore.kernel.org/all/20240807-macos-build-support-v1-11-4cd1ded85694@samsung.com/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20240809160853.1269466-1-masahiroy@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-13	tty: serial: fsl_lpuart: mark last busy before uart_add_one_port	Peng Fan
	With "earlycon initcall_debug=1 loglevel=8" in bootargs, kernel sometimes boot hang. It is because normal console still is not ready, but runtime suspend is called, so early console putchar will hang in waiting TRDE set in UARTSTAT. The lpuart driver has auto suspend delay set to 3000ms, but during uart_add_one_port, a child device serial ctrl will added and probed with its pm runtime enabled(see serial_ctrl.c). The runtime suspend call path is: device_add \|-> bus_probe_device \|->device_initial_probe \|->__device_attach \|-> pm_runtime_get_sync(dev->parent); \|-> pm_request_idle(dev); \|-> pm_runtime_put(dev->parent); So in the end, before normal console ready, the lpuart get runtime suspended. And earlycon putchar will hang. To address the issue, mark last busy just after pm_runtime_enable, three seconds is long enough to switch from bootconsole to normal console. Fixes: 43543e6f539b ("tty: serial: fsl_lpuart: Add runtime pm support") Cc: stable <stable@kernel.org> Signed-off-by: Peng Fan <peng.fan@nxp.com> Link: https://lore.kernel.org/r/20240808140325.580105-1-peng.fan@oss.nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-13	Merge branch 'net-netconsole-fix-netconsole-unsafe-locking'	Paolo Abeni
	Breno Leitao says: ==================== net: netconsole: Fix netconsole unsafe locking Problem: ======= The current locking mechanism in netconsole is unsafe and suboptimal due to the following issues: 1) Lock Release and Reacquisition Mid-Loop: In netconsole_netdev_event(), the target_list_lock is released and reacquired within a loop, potentially causing collisions and cleaning up targets that are being enabled. int netconsole_netdev_event() { ... spin_lock_irqsave(&target_list_lock, flags); list_for_each_entry(nt, &target_list, list) { spin_unlock_irqrestore(&target_list_lock, flags); __netpoll_cleanup(&nt->np); spin_lock_irqsave(&target_list_lock, flags); } spin_lock_irqsave(&target_list_lock, flags); ... } 2) Non-Atomic Cleanup Operations: In enabled_store(), the cleanup of structures is not atomic, risking cleanup of structures that are in the process of being enabled. size_t enabled_store() { ... spin_lock_irqsave(&target_list_lock, flags); nt->enabled = false; spin_unlock_irqrestore(&target_list_lock, flags); netpoll_cleanup(&nt->np); ... } These issues stem from the following limitations in netconsole's locking design: 1) write_{ext_}msg() functions: a) Cannot sleep b) Must iterate through targets and send messages to all enabled entries. c) List iteration is protected by target_list_lock spinlock. 2) Network event handling in netconsole_netdev_event(): a) Needs to sleep b) Requires iteration over the target list (holding target_list_lock spinlock). c) Some events necessitate netpoll struct cleanup, which needs to sleep. The target_list_lock needs to be used by non-sleepable functions while also protecting operations that may sleep, leading to the current unsafe design. Solution: ======== 1) Dual Locking Mechanism: - Retain current target_list_lock for non-sleepable use cases. - Introduce target_cleanup_list_lock (mutex) for sleepable operations. 2) Deferred Cleanup: - Implement atomic, deferred cleanup of structures using the new mutex (target_cleanup_list_lock). - Avoid the `goto` in the middle of the list_for_each_entry 3) Separate Cleanup List: - Create target_cleanup_list for deferred cleanup, protected by target_cleanup_list_lock. - This allows cleanup() to sleep without affecting message transmission. - When iterating over targets, move devices needing cleanup to target_cleanup_list. - Handle cleanup under the target_cleanup_list_lock mutex. 4) Make a clear locking hierarchy - The target_cleanup_list_lock takes precedence over target_list_lock. - Major Workflow Locking Sequences: a) Network Event Affecting Netpoll (netconsole_netdev_event): rtnl -> target_cleanup_list_lock -> target_list_lock b) Message Writing (write_msg()): console_lock -> target_list_lock c) Configfs Target Enable/Disable (enabled_store()): dynamic_netconsole_mutex -> target_cleanup_list_lock -> target_list_lock This hierarchy ensures consistent lock acquisition order across different operations, preventing deadlocks and maintaining proper synchronization. The target_cleanup_list_lock's higher priority allows for safe deferred cleanup operations without interfering with regular message transmission protected by target_list_lock. Each workflow follows a specific locking sequence, ensuring that operations like network event handling, message writing, and target management are properly synchronized and do not conflict with each other. Changelog: v3: * Move netconsole_process_cleanups() function to inside CONFIG_NETCONSOLE_DYNAMIC block, avoiding Werror=unused-function (Jakub) v2: * The selftest has been removed from the patchset because veth is now IFF_DISABLE_NETPOLL. A new test will be sent separately. * https://lore.kernel.org/all/20240807091657.4191542-1-leitao@debian.org/ v1: * https://lore.kernel.org/all/20240801161213.2707132-1-leitao@debian.org/ ==================== Link: https://patch.msgid.link/20240808122518.498166-1-leitao@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-13	net: netconsole: Defer netpoll cleanup to avoid lock release during list ↵	Breno Leitao
	traversal Current issue: - The `target_list_lock` spinlock is held while iterating over target_list() entries. - Mid-loop, the lock is released to call __netpoll_cleanup(), then reacquired. - This practice compromises the protection provided by `target_list_lock`. Reason for current design: 1. __netpoll_cleanup() may sleep, incompatible with holding a spinlock. 2. target_list_lock must be a spinlock because write_msg() cannot sleep. (See commit b5427c27173e ("[NET] netconsole: Support multiple logging targets")) Defer the cleanup of the netpoll structure to outside the target_list_lock() protected area. Create another list (target_cleanup_list) to hold the entries that need to be cleaned up, and clean them using a mutex (target_cleanup_list_lock). Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-13	net: netconsole: Unify Function Return Paths	Breno Leitao
	The return flow in netconsole's dynamic functions is currently inconsistent. This patch aims to streamline and standardize the process by ensuring that the mutex is unlocked before returning the ret value. Additionally, this update includes a minor functional change where certain strnlen() operations are performed with the dynamic_netconsole_mutex locked. This adjustment is not anticipated to cause any issues, however, it is crucial to document this change for clarity. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-13	net: netconsole: Standardize variable naming	Breno Leitao
	Update variable names from err to ret in cases where the variable may return non-error values. This change facilitates a forthcoming patch that relies on ret being used consistently to handle return values, regardless of whether they indicate an error or not. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-13	net: netconsole: Correct mismatched return types	Breno Leitao
	netconsole incorrectly mixes int and ssize_t types by using int for return variables in functions that should return ssize_t. This is fixed by updating the return variables to the appropriate ssize_t type, ensuring consistency across the function definitions. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-13	net: netpoll: extract core of netpoll_cleanup	Breno Leitao
	Extract the core part of netpoll_cleanup(), so, it could be called from a caller that has the rtnl lock already. Netconsole uses this in a weird way right now: __netpoll_cleanup(&nt->np); spin_lock_irqsave(&target_list_lock, flags); netdev_put(nt->np.dev, &nt->np.dev_tracker); nt->np.dev = NULL; nt->enabled = false; This will be replaced by do_netpoll_cleanup() as the locking situation is overhauled. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Rik van Riel <riel@surriel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-13	iommu: Remove unused declaration iommu_sva_unbind_gpasid()	Yue Haibing
	Commit 0c9f17877891 ("iommu: Remove guest pasid related interfaces and definitions") removed the implementation but leave declaration. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240808140619.2498535-1-yuehaibing@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-08-13	irqdomain: Remove stray '-' in the domain name	Andy Shevchenko
	When the domain suffix is not supplied alloc_fwnode_name() unconditionally adds a separator. Fix the format strings to get rid of the stray '-' separator. Fixes: 1e7c05292531 ("irqdomain: Allow giving name suffix for domain") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20240812193101.1266625-3-andriy.shevchenko@linux.intel.com
2024-08-13	irqdomain: Clarify checks for bus_token	Andy Shevchenko
	The code uses if (bus_token) and if (bus_token == DOMAIN_BUS_ANY). Since bus_token is an enum, the latter is more robust against changes. Convert all !bus_token checks to explicitely check for DOMAIN_BUS_ANY. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20240812193101.1266625-2-andriy.shevchenko@linux.intel.com
2024-08-13	arm64: dts: imx8mm-phygate: fix typo pinctrcl-0	Frank Li
	Fix typo pinctrcl-0 with pinctrl-0. Fix below warning: arch/arm64/boot/dts/freescale/imx8mm-phygate-tauri-l-rs232-rs485.dtb: gpio@30220000: 'pinctrl-0' is a dependency of 'pinctrl-names' from schema $id: http://devicetree.org/schemas/pinctrl/pinctrl-consumer.yaml# arch/arm64/boot/dts/freescale/imx8mm-phygate-tauri-l-rs232-rs485.dtb: uart4_rs485_en: $nodename:0: 'uart4_rs485_en' does not match '^(hog-[0-9]+\|.+-hog(-[0-9]+)?)$ Fixes: 8d97083c0b5d ("arm64: dts: phygate-tauri-l: add overlays for RS232 and RS485") Reviewed-by: Teresa Remmet <t.remmet@phytec.de> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2024-08-13	usb: misc: ljca: Add Lunar Lake ljca GPIO HID to ljca_gpio_hids[]	Hans de Goede
	Add LJCA GPIO support for the Lunar Lake platform. New HID taken from out of tree ivsc-driver git repo. Link: https://github.com/intel/ivsc-driver/commit/47e7c4a446c8ea8c741ff5a32fa7b19f9e6fd47e Cc: stable <stable@kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20240812095038.555837-1-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-13	Revert "usb: typec: tcpm: clear pd_event queue in PORT_RESET"	Xu Yang
	This reverts commit bf20c69cf3cf9c6445c4925dd9a8a6ca1b78bfdf. During tcpm_init() stage, if the VBUS is still present after tcpm_reset_port(), then we assume that VBUS will off and goto safe0v after a specific discharge time. Following a TCPM_VBUS_EVENT event if VBUS reach to off state. TCPM_VBUS_EVENT event may be set during PORT_RESET handling stage. If pd_events reset to 0 after TCPM_VBUS_EVENT set, we will lost this VBUS event. Then the port state machine may stuck at one state. Before: [ 2.570172] pending state change PORT_RESET -> PORT_RESET_WAIT_OFF @ 100 ms [rev1 NONE_AMS] [ 2.570179] state change PORT_RESET -> PORT_RESET_WAIT_OFF [delayed 100 ms] [ 2.570182] pending state change PORT_RESET_WAIT_OFF -> SNK_UNATTACHED @ 920 ms [rev1 NONE_AMS] [ 3.490213] state change PORT_RESET_WAIT_OFF -> SNK_UNATTACHED [delayed 920 ms] [ 3.490220] Start toggling [ 3.546050] CC1: 0 -> 0, CC2: 0 -> 2 [state TOGGLING, polarity 0, connected] [ 3.546057] state change TOGGLING -> SRC_ATTACH_WAIT [rev1 NONE_AMS] After revert this patch, we can see VBUS off event and the port will goto expected state. [ 2.441992] pending state change PORT_RESET -> PORT_RESET_WAIT_OFF @ 100 ms [rev1 NONE_AMS] [ 2.441999] state change PORT_RESET -> PORT_RESET_WAIT_OFF [delayed 100 ms] [ 2.442002] pending state change PORT_RESET_WAIT_OFF -> SNK_UNATTACHED @ 920 ms [rev1 NONE_AMS] [ 2.442122] VBUS off [ 2.442125] state change PORT_RESET_WAIT_OFF -> SNK_UNATTACHED [rev1 NONE_AMS] [ 2.442127] VBUS VSAFE0V [ 2.442351] CC1: 0 -> 0, CC2: 0 -> 0 [state SNK_UNATTACHED, polarity 0, disconnected] [ 2.442357] Start toggling [ 2.491850] CC1: 0 -> 0, CC2: 0 -> 2 [state TOGGLING, polarity 0, connected] [ 2.491858] state change TOGGLING -> SRC_ATTACH_WAIT [rev1 NONE_AMS] [ 2.491863] pending state change SRC_ATTACH_WAIT -> SNK_TRY @ 200 ms [rev1 NONE_AMS] [ 2.691905] state change SRC_ATTACH_WAIT -> SNK_TRY [delayed 200 ms] Fixes: bf20c69cf3cf ("usb: typec: tcpm: clear pd_event queue in PORT_RESET") Cc: stable@vger.kernel.org Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20240809112901.535072-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-13	usb: typec: ucsi: Fix the return value of ucsi_run_command()	Heikki Krogerus
	The command execution routines need to return the amount of data that was transferred when succesful. This fixes an issue where the alternate modes and the power delivery capabilities are not getting registered. Fixes: 5e9c1662a89b ("usb: typec: ucsi: rework command execution functions") Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20240809150343.286942-1-heikki.krogerus@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-13	usb: xhci: fix duplicate stall handling in handle_tx_event()	Niklas Neronin
	Stall handling is managed in the 'process_' functions, which are called right before the 'goto' stall handling code snippet. Thus, there should be a return after the 'process_' functions. Otherwise, the stall code may run twice. Fixes: 1b349f214ac7 ("usb: xhci: add 'goto' for halted endpoint check in handle_tx_event()") Reported-by: Michal Pecio <michal.pecio@gmail.com> Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20240809124408.505786-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-13	usb: xhci: Check for xhci->interrupters being allocated in xhci_mem_clearup()	Marc Zyngier
	If xhci_mem_init() fails, it calls into xhci_mem_cleanup() to mop up the damage. If it fails early enough, before xhci->interrupters is allocated but after xhci->max_interrupters has been set, which happens in most (all?) cases, things get uglier, as xhci_mem_cleanup() unconditionally derefences xhci->interrupters. With prejudice. Gate the interrupt freeing loop with a check on xhci->interrupters being non-NULL. Found while debugging a DMA allocation issue that led the XHCI driver on this exact path. Fixes: c99b38c41234 ("xhci: add support to allocate several interrupters") Cc: Mathias Nyman <mathias.nyman@linux.intel.com> Cc: Wesley Cheng <quic_wcheng@quicinc.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org # 6.8+ Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20240809124408.505786-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-13	Merge tag 'thunderbolt-for-v6.11-rc3' of ↵	Greg Kroah-Hartman
	ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-linus thunderbolt: Fixes for v6.11-rc3 This includes following USB4/Thunderbolt fixes for v6.11-rc3: - Fix memory leak in debugfs sideband register access - Fix hang when host router NVM is upgraded and there is another host connected. Both have been in linux-next with no reported issues. * tag 'thunderbolt-for-v6.11-rc3' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt: thunderbolt: Mark XDomain as unplugged when router is removed thunderbolt: Fix memory leaks in {port\|retimer}_sb_regs_write()
2024-08-13	platform/surface: aggregator: Fix warning when controller is destroyed in probe	Maximilian Luz
	There is a small window in ssam_serial_hub_probe() where the controller is initialized but has not been started yet. Specifically, between ssam_controller_init() and ssam_controller_start(). Any failure in this window, for example caused by a failure of serdev_device_open(), currently results in an incorrect warning being emitted. In particular, any failure in this window results in the controller being destroyed via ssam_controller_destroy(). This function checks the state of the controller and, in an attempt to validate that the controller has been cleanly shut down before we try and deallocate any resources, emits a warning if that state is not SSAM_CONTROLLER_STOPPED. However, since we have only just initialized the controller and have not yet started it, its state is SSAM_CONTROLLER_INITIALIZED. Note that this is the only point at which the controller has this state, as it will change after we start the controller with ssam_controller_start() and never revert back. Further, at this point no communication has taken place and the sender and receiver threads have not been started yet (and we may not even have an open serdev device either). Therefore, it is perfectly safe to call ssam_controller_destroy() with a state of SSAM_CONTROLLER_INITIALIZED. This, however, means that the warning currently being emitted is incorrect. Fix it by extending the check. Fixes: c167b9c7e3d6 ("platform/surface: Add Surface Aggregator subsystem") Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20240811124645.246016-1-luzmaximilian@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-08-13	char: xillybus: Don't destroy workqueue from work item running on it	Eli Billauer
	Triggered by a kref decrement, destroy_workqueue() may be called from within a work item for destroying its own workqueue. This illegal situation is averted by adding a module-global workqueue for exclusive use of the offending work item. Other work items continue to be queued on per-device workqueues to ensure performance. Reported-by: syzbot+91dbdfecdd3287734d8e@syzkaller.appspotmail.com Cc: stable <stable@kernel.org> Closes: https://lore.kernel.org/lkml/0000000000000ab25a061e1dfe9f@google.com/ Signed-off-by: Eli Billauer <eli.billauer@gmail.com> Link: https://lore.kernel.org/r/20240801121126.60183-1-eli.billauer@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-13	platform/surface: aggregator_registry: Add support for Surface Laptop 6	Maximilian Luz
	Add SAM client device nodes for the Surface Laptop Studio 6 (SL6). The SL6 is similar to the SL5, with the typical battery/AC, platform profile, and HID nodes. It also has support for the newly supported fan interface. Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20240811131948.261806-6-luzmaximilian@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-08-13	platform/surface: aggregator_registry: Add fan and thermal sensor support ↵	Maximilian Luz
	for Surface Laptop 5 The EC on the Surface Laptop 5 exposes the fan interface. With the recently introduced driver for it, we can now also enable it here. In addition, also enable the thermal sensor interface. Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20240811131948.261806-5-luzmaximilian@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-08-13	platform/surface: aggregator_registry: Add support for Surface Laptop Studio 2	Maximilian Luz
	Add SAM client device nodes for the Surface Laptop Studio 2 (SLS2). The SLS2 is quite similar to the SLS1, but it does not provide the touchpad as a SAM-HID device. Therefore, add a new node group for the SLS2 and update the comments accordingly. In addition, it uses the new fan control interface. Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20240811131948.261806-4-luzmaximilian@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-08-13	platform/surface: aggregator_registry: Add support for Surface Laptop Go 3	Maximilian Luz
	Add SAM client device nodes for the Surface Laptop Go 3. It seems to use the same SAM client devices as the Surface Laptop Go 1 and 2, so re-use their node group. Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20240811131948.261806-3-luzmaximilian@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-08-13	platform/surface: aggregator_registry: Add Support for Surface Pro 10	Maximilian Luz
	Add SAM client device nodes for the Surface Pro 10. It seems to use the same SAM client devices as the Surface Pro 9, so re-use its node group. Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20240811131948.261806-2-luzmaximilian@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-08-13	platform/x86: asus-wmi: Add quirk for ROG Ally X	Luke D. Jones
	The new ROG Ally X functions the same as the previus model so we can use the same method to ensure the MCU USB devices wake and reconnect correctly. Given that two devices marks the start of a trend, this patch also adds a quirk table to make future additions easier if the MCU is the same. Signed-off-by: Luke D. Jones <luke@ljones.dev> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240805234603.38736-1-luke@ljones.dev Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-08-13	Merge branch 'stmmac-add-loongson-platform-support'	Paolo Abeni
	Yanteng Si says: ==================== stmmac: Add Loongson platform support v17: * As Serge's comments: Add return 0 for _dt_config(). Get back the conditional MSI-clear method execution. v16: * As Serge's comments: Move the of_node_put(plat->mdio_node) call to the DT-config/clear methods. Drop 'else if'. * Modify the commit message of 7/14. (LS2K CPU -> LS2K SOC) V15: * Drop return that will not be executed. * Move pdev from patch 12 to patch 13 to pass W=1 builds. RFC v15: * As Serge's comments: Extend the commit message.(patch 7 and patch 11) Add fixes tag for patch 8. Add loongson_dwmac_dt_clear() patch. Modify loongson_dwmac_msi_config(). ... * Pick Huacai's Acked-by tag. * Pick Serge's Reviewed-by tag. * I have already contacted the author(ZhangQing) of the module, so I copied her valid email: diasyzhang@tencent.com. Note: I replied to the comments on v14 last Sunday, but all of Loongson's email servers failed to deliver. The network administrator told me today that he has fixed the problem and re-delivered all the failed emails, but I did not see them on the mailing list. I hope they will not suddenly appear in everyone's mailbox one day. I apologize for this. (The email content mainly agrees with Serge's suggestion.) v14: Because Loongson GMAC can be also found with the 8-channels AV feature enabled, we'll need to reconsider the patches logic and thus the commit logs too. As Serge's comments and Russell's comments: [PATCH net-next v14 01/15] net: stmmac: Move the atds flag to the stmmac_dma_cfg structure [PATCH net-next v14 02/15] net: stmmac: Add multi-channel support [PATCH net-next v14 03/15] net: stmmac: Export dwmac1000_dma_ops [PATCH net-next v14 04/15] net: stmmac: dwmac-loongson: Drop duplicated hash-based filter size init [PATCH net-next v14 05/15] net: stmmac: dwmac-loongson: Drop pci_enable/disable_msi calls [PATCH net-next v14 06/15] net: stmmac: dwmac-loongson: Use PCI_DEVICE_DATA() macro for device identification [PATCH net-next v14 07/15] net: stmmac: dwmac-loongson: Detach GMAC-specific platform data init +-> Init the plat_stmmacenet_data::{tx_queues_to_use,rx_queues_to_use} in the loongson_gmac_data() method. [PATCH net-next v14 08/15] net: stmmac: dwmac-loongson: Init ref and PTP clocks rate [PATCH net-next v14 09/15] net: stmmac: dwmac-loongson: Add phy_interface for Loongson GMAC [PATCH net-next v14 10/15] net: stmmac: dwmac-loongson: Introduce PCI device info data +-> Make sure the setup() method is called after the pci_enable_device() invocation. [PATCH net-next v14 11/15] net: stmmac: dwmac-loongson: Add DT-less GMAC PCI-device support +-> Introduce the loongson_dwmac_dt_config() method here instead of doing that in a separate patch. +-> Add loongson_dwmac_acpi_config() which would just get the IRQ from the pdev->irq field and make sure it is valid. [PATCH net-next v14 12/15] net: stmmac: Fixed failure to set network speed to 1000. +-> Drop the patch as Russell's comments, At the same time, he provided another better repair suggestion, and I decided to send it separately after the patch set was merged. See: <https://lore.kernel.org/netdev/ZoW1fNqV3PxEobFx@shell.armlinux.org.uk/> [PATCH net-next v14 13/15] net: stmmac: dwmac-loongson: Add Loongson Multi-channels GMAC support +-> This is former "net: stmmac: dwmac-loongson: Add Loongson GNET support" patch, but which adds the support of the Loongson GMAC with the 8-channels AV-feature available. +-> loongson_dwmac_intx_config() shall be dropped due to the loongson_dwmac_acpi_config() method added in the PATCH 11/15. +-> Make sure loongson_data::loongson_id is initialized before the stmmac_pci_info::setup() is called. +-> Move the rx_queues_to_use/tx_queues_to_use and coe_unsupported fields initialization to the loongson_gmac_data() method. +-> As before, call the loongson_dwmac_msi_config() method if the multi-channels Loongson MAC has been detected. +-> Move everything GNET-specific to the next patch. [PATCH net-next v14 14/15] net: stmmac: dwmac-loongson: Add Loongson GNET support +-> Everything Loonsgson GNET-specific is supposed to be added in the framework of this patch: + PCI_DEVICE_ID_LOONGSON_GNET macro + loongson_gnet_fix_speed() method + loongson_gnet_data() method + loongson_gnet_pci_info data + The GNET-specific part of the loongson_dwmac_setup() method. + ... [PATCH net-next v14 15/15] net: stmmac: dwmac-loongson: Add loongson module author Other's: Pick Serge's Reviewed-by tag. v13: * Sorry, we have clarified some things in the past 10 days. I did not give you a clear reply to the following questions in v12, so I need to reply again: 1. The current LS2K2000 also have a GMAC(and two GNET) that supports 8 channels, so we have to reconsider the initialization of tx/rx_queues_to_use into probe(); 2. In v12, we disagreed on the loongson_dwmac_msi_config method, but I changed it based on Serge's comments(If I understand correctly): if (dev_of_node(&pdev->dev)) { ret = loongson_dwmac_dt_config(pdev, plat, &res); } if (ld->loongson_id == DWMAC_CORE_LS2K2000) { ret = loongson_dwmac_msi_config(pdev, plat, &res); } else { ret = loongson_dwmac_intx_config(pdev, plat, &res); } 3. Our priv->dma_cap.pcs is false, so let's use PHY_INTERFACE_MODE_NA; 4. Our GMAC does not support Delay, so let's use PHY_INTERFACE_MODE_RGMII_ID, the current dts is wrong, a fix patch will be sent to the LoongArch list later. Others: * Re-split a part of the patch (it seems we do this with every version); * Copied Serge's comments into the commit message of patch; * Fixed the stmmac_dma_operation_mode() method; * Changed some code comments. v12: * The biggest change is the re-splitting of patches. * Add a "gmac_version" in loongson_data, then we only read it once in the _probe(). * Drop Serge's patch. * Rebase to the latest code state. * Fixed the gnet commit message. v11: * Break loongson_phylink_get_caps(), fix bad logic. * Remove a unnecessary ";". * Remove some unnecessary "{}". * add a blank. * Move the code of fix _force_1000 to patch 6/6. The main changes occur in these two functions: loongson_dwmac_probe(); loongson_dwmac_setup(); v10: As Andrew's comment: * Add a #define for the 0x37. * Add a #define for Port Select. others: * Pick Serge's patch, This patch resulted from the process of reviewing our patch set. * Based on Serge's patch, modify our loongson_phylink_get_caps(). * Drop patch 3/6, we need mac_interface. * Adjusted the code layout of gnet patch. * Corrected several errata in commit message. * Move DISABLE_FORCE flag to loongson_gnet_data(). v9: We have not provided a detailed list of equipment for a long time, and I apologize for this. During this period, I have collected some information and now present it to you, hoping to alleviate the pressure of review. 1. IP core We now have two types of IP cores, one is 0x37, similar to dwmac1000; The other is 0x10. Compared to 0x37, we split several DMA registers from one to two, and it is not worth adding a new entry for this. According to Serge's comment, we made these devices work by overwriting priv->synopsys_id = 0x37 and mac->dma = <LS_dma_ops>. 1.1. Some more detailed information The number of DMA channels for 0x37 is 1; The number of DMA channels for 0x10 is 8. Except for channel 0, otherchannels do not support sending hardware checksums. Supported AV features are Qav, Qat, and Qas, and the rest are consistent with 3.73. 2. DEVICE We have two types of devices, one is GMAC, which only has a MAC chip inside and needs an external PHY chip; the other is GNET, which integrates both MAC and PHY chips inside. 2.1. Some more detailed information GMAC device: LS7A1000, LS2K1000, these devices do not support any pause mode. gnet device: LS7A2000, LS2K2000, the chip connection between the mac and phy of these devices is not normal and requires two rounds of negotiation; LS7A2000 does not support half-duplex and multi-channel; to enable multi-channel on LS2K2000, you need to turn off hardware checksum. Note: Only the LS2K2000's IP core is 0x10, while the IP cores of other devices are 0x37. 3. TABLE device type pci_id ip_core ls7a1000 gmac 7a03 0x35/0x37 ls2k1000 gmac 7a03 0x35/0x37 ls7a2000 gnet 7a13 0x37 ls2k2000 gnet 7a13 0x10 ----------------------------------------------- Changes: * passed the CI <https://github.com/linux-netdev/nipa/blob/main/tests/patch/checkpatch /checkpatch.sh> * reverse xmas tree order. * Silence build warning. * Re-split the patch. * Add more detailed commit message. * Add more code comment. * Reduce modification of generic code. * using the GNET-specific prefix. * define a new macro for the GNET MAC. * Use an easier way to overwrite mac. * Removed some useless printk. v8: * The biggest change is according to Serge's comment in the previous edition: Seeing the patch in the current state would overcomplicate the generic code and the only functions you need to update are dwmac_dma_interrupt() dwmac1000_dma_init_channel() you can have these methods re-defined with all the Loongson GNET specifics in the low-level platform driver (dwmac-loongson.c). After that you can just override the mac_device_info.dma pointer with a fixed stmmac_dma_ops descriptor. Here is what should be done for that: 1. Keep the Patch 4/9 with my comments fixed. First it will be partly useful for your GNET device. Second in general it's a correct implementation of the normal DW GMAC v3.x multi-channels feature and will be useful for the DW GMACs with that feature enabled. 2. Create the Loongson GNET-specific stmmac_dma_ops.dma_interrupt() stmmac_dma_ops.init_chan() methods in the dwmac-loongson.c driver. Don't forget to move all the Loongson-specific macros from dwmac_dma.h to dwmac-loongson.c. 3. Create a Loongson GNET-specific platform setup method with the next semantics: + allocate stmmac_dma_ops instance and initialize it with dwmac1000_dma_ops. + override the stmmac_dma_ops.{dma_interrupt, init_chan} with the pointers to the methods defined in 2. + allocate mac_device_info instance and initialize the mac_device_info.dma field with a pointer to the new stmmac_dma_ops instance. + call dwmac1000_setup() or initialize mac_device_info in a way it's done in dwmac1000_setup() (the later might be better so you wouldn't need to export the dwmac1000_setup() function). + override stmmac_priv.synopsys_id with a correct value. 4. Initialize plat_stmmacenet_data.setup() with the pointer to the method created in 3. * Others: Re-split the patch. Passed checkpatch.pl test. v7: * Refer to andrew's suggestion: - Add DMA_INTR_ENA_NIE_RX and DMA_INTR_ENA_NIE_TX #define's, etc. * Others: - Using --subject-prefix="PATCH net-next vN" to indicate that the patches are for the networking tree. - Rebase to the latest networking tree: <git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git> v6: * Refer to Serge's suggestion: - Add new platform feature flag: include/linux/stmmac.h: +#define STMMAC_FLAG_HAS_LGMAC BIT(13) - Add the IRQs macros specific to the Loongson Multi-channels GMAC: drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h: +#define DMA_INTR_ENA_NIE_LOONGSON 0x00060000 /* .../ #define DMA_INTR_ENA_NIE 0x00010000 / Normal Summary / ... - Drop all of redundant changes that don't require the prototypes being converted to accepting the stmmac_priv pointer. Refer to andrew's suggestion: - Drop white space changes. - break patch up into lots of smaller parts. Some small patches have been put into another series as a preparation see <https://lore.kernel.org/loongarch/cover.1702289232.git.siyanteng@loongson.cn/T/#t> note : This series of patches relies on the three small patches above. * others - Drop irq_flags changes. - Changed patch order. v4 -> v5: * Remove an ugly and useless patch (fix channel number). * Remove the non-standard dma64 driver code, and also remove the HWIF entries, since the associated custom callbacks no longer exist. * Refer to Serge's suggestion: Update the dwmac1000_dma.c to support the multi-DMA-channels controller setup. See: v4: <https://lore.kernel.org/loongarch/cover.1692696115.git.chenfeiyang@loongson.cn/> v3: <https://lore.kernel.org/loongarch/cover.1691047285.git.chenfeiyang@loongson.cn/> v2: <https://lore.kernel.org/loongarch/cover.1690439335.git.chenfeiyang@loongson.cn/> v1: <https://lore.kernel.org/loongarch/cover.1689215889.git.chenfeiyang@loongson.cn/> ==================== Link: https://patch.msgid.link/cover.1723014611.git.siyanteng@loongson.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-13	net: stmmac: dwmac-loongson: Add loongson module author	Yanteng Si
	Add Yanteng Si as MODULE_AUTHOR of Loongson DWMAC PCI driver. Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn> Signed-off-by: Yinggang Gu <guyinggang@loongson.cn> Acked-by: Huacai Chen <chenhuacai@loongson.cn> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Yanteng Si <siyanteng@loongson.cn> Tested-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>