summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-03-21net: phy: realtek: disable PHY-mode EEERussell King (Oracle)
Realtek RTL8211F has a "PHY-mode" EEE support which interferes with an IEEE 802.3 compliant implementation. This mode defaults to enabled, and results in the MAC receive path not seeing the link transition to LPI state. Fix this by disabling PHY-mode EEE. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1ttnHW-00785s-Uq@rmk-PC.armlinux.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21net: phy: fix genphy_c45_eee_is_active() for disabled EEERussell King (Oracle)
Commit 809265fe96fe ("net: phy: c45: remove local advertisement parameter from genphy_c45_eee_is_active") stopped reading the local advertisement from the PHY earlier in this development cycle, which broke "ethtool --set-eee ethX eee off". When ethtool is used to set EEE off, genphy_c45_eee_is_active() indicates that EEE was active if the link partner reported an advertisement, which causes phylib to set phydev->enable_tx_lpi on link up, despite our local advertisement in hardware being empty. However, phydev->advertising_eee is preserved while EEE is turned off, which leads to genphy_c45_eee_is_active() incorrectly reporting that EEE is active. Fix it by checking phydev->eee_cfg.eee_enabled, and if clear, immediately indicate that EEE is not active. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1ttmWN-0077Mb-Q6@rmk-PC.armlinux.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21Merge branch 'mlx5e-support-recovery-counter-in-reset'Paolo Abeni
Tariq Toukan says: ==================== mlx5e: Support recovery counter in reset This series by Yael adds a recovery counter in ethtool, for any recovery type during port reset cycle. Series starts with some cleanup and refactoring patches. New counter is added and exposed to ethtool stats in patch #4. ==================== Link: https://patch.msgid.link/1742112876-2890-1-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21net/mlx5e: Expose port reset cycle recovery counter via ethtoolYael Chemla
Display recovery event of PPCNT recovery counters group. Counts (per link) the number of total successful recovery events of any recovery types during port reset cycle. Signed-off-by: Yael Chemla <ychemla@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/1742112876-2890-5-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21net/mlx5e: Get counter group size by FW capabilityYael Chemla
Retrieve the number of fields supported by each PPCNT counter group based on the FW capability for this group. Signed-off-by: Yael Chemla <ychemla@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/1742112876-2890-4-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21net/mlx5e: Access PHY layer counter group as other counter groupsYael Chemla
Adjust the way physical layer counters group is accessed to match the generic method used for accessing other PPCNT counter groups. Signed-off-by: Yael Chemla <ychemla@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/1742112876-2890-3-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21net/mlx5e: Ensure each counter group uses its PCAM bitYael Chemla
The code was incorrectly relying on PCAM bit of ppcnt_statistical_group for accessing per_lane_error_counters. If ppcnt_statistical_group PCAM bit was not set, we would not read per_lane_error_counters, even when its PCAM bit is set. Given the existing device capabilities, it seems to cause no harm, so this change primarily serves as cleanup. Signed-off-by: Yael Chemla <ychemla@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/1742112876-2890-2-git-send-email-tariqt@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21selftests: ublk: fix starting ublk deviceMing Lei
Firstly ublk char device node may not be created by udev yet, so wait a while until it can be opened or timeout. Secondly delete created ublk device in case of start failure, otherwise the device becomes zombie. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250321135324.259677-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-21Merge tag 'pinctrl-v6.14-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fix from Linus Walleij: - A single patch for Spacemit K1 fixing up the Kconfig to not default to "y" * tag 'pinctrl-v6.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: spacemit: PINCTRL_SPACEMIT_K1 should not default to y unconditionally
2025-03-21s390/pci: Support mmap() of PCI resources except for ISM devicesNiklas Schnelle
So far s390 does not allow mmap() of PCI resources to user-space via the usual mechanisms, though it does use it for RDMA. For the PCI sysfs resource files and /proc/bus/pci it defines neither HAVE_PCI_MMAP nor ARCH_GENERIC_PCI_MMAP_RESOURCE. For vfio-pci s390 previously relied on disabled VFIO_PCI_MMAP and now relies on setting pdev->non_mappable_bars for all devices. This is partly because access to mapped PCI resources from user-space requires special PCI load/store memory-I/O (MIO) instructions, or the special MMIO syscalls when these are not available. Still, such access is possible and useful not just for RDMA, in fact not being able to mmap() PCI resources has previously caused extra work when testing devices. One thing that doesn't work with PCI resources mapped to user-space though is the s390 specific virtual ISM device. Not only because the BAR size of 256 TiB prevents mapping the whole BAR but also because access requires use of the legacy PCI instructions which are not accessible to user-space on systems with the newer MIO PCI instructions. Now with the pdev->non_mappable_bars flag ISM can be excluded from mapping its resources while making this functionality available for all other PCI devices. To this end introduce a minimal implementation of PCI_QUIRKS and use that to set pdev->non_mappable_bars for ISM devices only. Then also set ARCH_GENERIC_PCI_MMAP_RESOURCE to take advantage of the generic implementation of pci_mmap_resource_range() enabling only the newer sysfs mmap() interface. This follows the recommendation in Documentation/PCI/sysfs-pci.rst. Link: https://lore.kernel.org/r/20250226-vfio_pci_mmap-v7-3-c5c0f1d26efd@linux.ibm.com Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-21s390/pci: Introduce pdev->non_mappable_bars and replace VFIO_PCI_MMAPNiklas Schnelle
The ability to map PCI resources to user-space is controlled by global defines. For vfio there is VFIO_PCI_MMAP which is only disabled on s390 and controls mapping of PCI resources using vfio-pci with a fallback option via the pread()/pwrite() interface. For the PCI core there is ARCH_GENERIC_PCI_MMAP_RESOURCE which enables a generic implementation for mapping PCI resources plus the newer sysfs interface. Then there is HAVE_PCI_MMAP which can be used with custom definitions of pci_mmap_resource_range() and the historical /proc/bus/pci interface. Both mechanisms are all or nothing. For s390 mapping PCI resources is possible and useful for testing and certain applications such as QEMU's vfio-pci based user-space NVMe driver. For certain devices, however access to PCI resources via mappings to user-space is not possible and these must be excluded from the general PCI resource mapping mechanisms. Introduce pdev->non_mappable_bars to indicate that a PCI device's BARs can not be accessed via mappings to user-space. In the future this enables per-device restrictions of PCI resource mapping. For now, set this flag for all PCI devices on s390 in line with the existing, general disable of PCI resource mapping. As s390 is the only user of the VFI_PCI_MMAP Kconfig options this can already be replaced with a check of this new flag. Also add similar checks in the other code protected by HAVE_PCI_MMAP respectively ARCH_GENERIC_PCI_MMAP in preparation for enabling these for supported devices. Link: https://lore.kernel.org/lkml/20250212132808.08dcf03c.alex.williamson@redhat.com/ Link: https://lore.kernel.org/r/20250226-vfio_pci_mmap-v7-2-c5c0f1d26efd@linux.ibm.com Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-21s390/pci: Fix s390_mmio_read/write syscall page fault handlingNiklas Schnelle
The s390 MMIO syscalls when using the classic PCI instructions do not cause a page fault when follow_pfnmap_start() fails due to the page not being present. Besides being a general deficiency this breaks vfio-pci's mmap() handling once VFIO_PCI_MMAP gets enabled as this lazily maps on first access. Fix this by following a failed follow_pfnmap_start() with fixup_user_page() and retrying the follow_pfnmap_start(). Also fix a VM_READ vs VM_WRITE mixup in the read syscall. Link: https://lore.kernel.org/r/20250226-vfio_pci_mmap-v7-1-c5c0f1d26efd@linux.ibm.com Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
2025-03-21PCI: Fix NULL dereference in SR-IOV VF creation error pathShay Drory
Clean up when virtfn setup fails to prevent NULL pointer dereference during device removal. The kernel oops below occurred due to incorrect error handling flow when pci_setup_device() fails. Add pci_iov_scan_device(), which handles virtfn allocation and setup and cleans up if pci_setup_device() fails, so pci_iov_add_virtfn() doesn't need to call pci_stop_and_remove_bus_device(). This prevents accessing partially initialized virtfn devices during removal. BUG: kernel NULL pointer dereference, address: 00000000000000d0 RIP: 0010:device_del+0x3d/0x3d0 Call Trace: pci_remove_bus_device+0x7c/0x100 pci_iov_add_virtfn+0xfa/0x200 sriov_enable+0x208/0x420 mlx5_core_sriov_configure+0x6a/0x160 [mlx5_core] sriov_numvfs_store+0xae/0x1a0 Link: https://lore.kernel.org/r/20250310084524.599225-1-shayd@nvidia.com Fixes: e3f30d563a38 ("PCI: Make pci_destroy_dev() concurrent safe") Signed-off-by: Shay Drory <shayd@nvidia.com> [bhelgaas: commit log, return ERR_PTR(-ENOMEM) directly] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Keith Busch <kbusch@kernel.org>
2025-03-21tracepoint: Print the function symbol when tracepoint_debug is setHuang Shijie
When tracepoint_debug is set, we may get the output in kernel log: [ 380.013843] Probe 0 : 00000000f0d68cda It is not readable, so change to print the function symbol. After this patch, the output may becomes: [ 55.225555] Probe 0 : perf_trace_sched_wakeup_template+0x0/0x20 Link: https://lore.kernel.org/20250307033858.4134-1-shijie@os.amperecomputing.com Signed-off-by: Huang Shijie <shijie@os.amperecomputing.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-21io_uring/net: only import send_zc buffer onceCaleb Sander Mateos
io_send_zc() guards its call to io_send_zc_import() with if (!done_io) in an attempt to avoid calling it redundantly on the same req. However, if the initial non-blocking issue returns -EAGAIN, done_io will stay 0. This causes the subsequent issue to unnecessarily re-import the buffer. Add an explicit flag "imported" to io_sr_msg to track if its buffer has already been imported. Clear the flag in io_send_zc_prep(). Call io_send_zc_import() and set the flag in io_send_zc() if it is unset. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Fixes: 54cdcca05abd ("io_uring/net: switch io_send() and io_send_zc() to using io_async_msghdr") Link: https://lore.kernel.org/r/20250321184819.3847386-2-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-21io_uring/cmd: introduce io_uring_cmd_import_fixed_vecPavel Begunkov
io_uring_cmd_import_fixed_vec() is a cmd helper around vectored registered buffer import functions, which caches the memory under the hood. The lifetime of the vectore and hence the iterator is bound to the request. Furthermore, the user is not allowed to call it multiple times for a single request. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/97487a80dec3fb8cf8aeedf1f9026ef6d503fe4b.1742579999.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-21io_uring/cmd: add iovec cache for commandsPavel Begunkov
Add iou_vec to commands and wire caching for it, but don't expose it to users just yet. We need the vec cleared on initial alloc, but since we can't place it at the beginning at the moment, zero the entire async_data. It's cached, and the performance effects only the initial allocation, and it might be not a bad idea since we're exposing those bits to outside drivers. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c0f2145b75791bc6106eb4e72add2cf6a2c72a7a.1742579999.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-21x86/hyperv: Add comments about hv_vpset and var size hypercall input argsMichael Kelley
Current code varies in how the size of the variable size input header for hypercalls is calculated when the input contains struct hv_vpset. Surprisingly, this variation is correct, as different hypercalls make different choices for what portion of struct hv_vpset is treated as part of the variable size input header. The Hyper-V TLFS is silent on these details, but the behavior has been confirmed with Hyper-V developers. To avoid future confusion about these differences, add comments to struct hv_vpset, and to hypercall call sites with input that contains a struct hv_vpset. The comments describe the overall situation and the calculation that should be used at each particular call site. No functional change as only comments are updated. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20250318214919.958953-1-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250318214919.958953-1-mhklinux@outlook.com>
2025-03-21Drivers: hv: Introduce mshv_root module to expose /dev/mshv to VMMsNuno Das Neves
Provide a set of IOCTLs for creating and managing child partitions when running as root partition on Hyper-V. The new driver is enabled via CONFIG_MSHV_ROOT. A brief overview of the interface: MSHV_CREATE_PARTITION is the entry point, returning a file descriptor representing a child partition. IOCTLs on this fd can be used to map memory, create VPs, etc. Creating a VP returns another file descriptor representing that VP which in turn has another set of corresponding IOCTLs for running the VP, getting/setting state, etc. MSHV_ROOT_HVCALL is a generic "passthrough" hypercall IOCTL which can be used for a number of partition or VP hypercalls. This is for hypercalls that do not affect any state in the kernel driver, such as getting and setting VP registers and partition properties, translating addresses, etc. It is "passthrough" because the binary input and output for the hypercall is only interpreted by the VMM - the kernel driver does nothing but insert the VP and partition id where necessary (which are always in the same place), and execute the hypercall. Co-developed-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Signed-off-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Co-developed-by: Jinank Jain <jinankjain@microsoft.com> Signed-off-by: Jinank Jain <jinankjain@microsoft.com> Co-developed-by: Mukesh Rathor <mrathor@linux.microsoft.com> Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com> Co-developed-by: Muminul Islam <muislam@microsoft.com> Signed-off-by: Muminul Islam <muislam@microsoft.com> Co-developed-by: Praveen K Paladugu <prapal@linux.microsoft.com> Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> Co-developed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Co-developed-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Roman Kisel <romank@linux.microsoft.com> Link: https://lore.kernel.org/r/1741980536-3865-11-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1741980536-3865-11-git-send-email-nunodasneves@linux.microsoft.com>
2025-03-21eth: bnxt: fix out-of-range access of vnic_info arrayTaehee Yoo
The bnxt_queue_{start | stop}() access vnic_info as much as allocated, which indicates bp->nr_vnics. So, it should not reach bp->vnic_info[bp->nr_vnics]. Fixes: 661958552eda ("eth: bnxt: do not use BNXT_VNIC_NTUPLE unconditionally in queue restart logic") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250316025837.939527-1-ap420073@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21selftests/timers: Improve skew_consistency by testing with other clockidsJohn Stultz
Lei Chen reported a bug with CLOCK_MONOTONIC_COARSE having inconsistencies when NTP is adjusting the clock frequency. This has gone seemingly undetected for ~15 years, illustrating a clear gap in our testing. The skew_consistency test is intended to catch this sort of problem, but was focused on only evaluating CLOCK_MONOTONIC, and thus missed the problem on CLOCK_MONOTONIC_COARSE. So adjust the test to run with all clockids for 60 seconds each instead of 10 minutes with just CLOCK_MONOTONIC. Reported-by: Lei Chen <lei.chen@smartx.com> Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250320200306.1712599-2-jstultz@google.com Closes: https://lore.kernel.org/lkml/20250310030004.3705801-1-lei.chen@smartx.com/
2025-03-21timekeeping: Fix possible inconsistencies in _COARSE clockidsJohn Stultz
Lei Chen raised an issue with CLOCK_MONOTONIC_COARSE seeing time inconsistencies. Lei tracked down that this was being caused by the adjustment tk->tkr_mono.xtime_nsec -= offset; which is made to compensate for the unaccumulated cycles in offset when the multiplicator is adjusted forward, so that the non-_COARSE clockids don't see inconsistencies. However, the _COARSE clockid getter functions use the adjusted xtime_nsec value directly and do not compensate the negative offset via the clocksource delta multiplied with the new multiplicator. In that case the caller can observe time going backwards in consecutive calls. By design, this negative adjustment should be fine, because the logic run from timekeeping_adjust() is done after it accumulated approximately multiplicator * interval_cycles into xtime_nsec. The accumulated value is always larger then the mult_adj * offset value, which is subtracted from xtime_nsec. Both operations are done together under the tk_core.lock, so the net change to xtime_nsec is always always be positive. However, do_adjtimex() calls into timekeeping_advance() as well, to to apply the NTP frequency adjustment immediately. In this case, timekeeping_advance() does not return early when the offset is smaller then interval_cycles. In that case there is no time accumulated into xtime_nsec. But the subsequent call into timekeeping_adjust(), which modifies the multiplicator, subtracts from xtime_nsec to correct for the new multiplicator. Here because there was no accumulation, xtime_nsec becomes smaller than before, which opens a window up to the next accumulation, where the _COARSE clockid getters, which don't compensate for the offset, can observe the inconsistency. To fix this, rework the timekeeping_advance() logic so that when invoked from do_adjtimex(), the time is immediately forwarded to accumulate also the sub-interval portion into xtime. That means the remaining offset becomes zero and the subsequent multiplier adjustment therefore does not modify xtime_nsec. There is another related inconsistency. If xtime is forwarded due to the instantaneous multiplier adjustment, the NTP error, which was accumulated with the previous setting, becomes meaningless. Therefore clear the NTP error as well, after forwarding the clock for the instantaneous multiplier update. Fixes: da15cfdae033 ("time: Introduce CLOCK_REALTIME_COARSE") Reported-by: Lei Chen <lei.chen@smartx.com> Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250320200306.1712599-1-jstultz@google.com Closes: https://lore.kernel.org/lkml/20250310030004.3705801-1-lei.chen@smartx.com/
2025-03-21mptcp: sockopt: fix getting freebind & transparentMatthieu Baerts (NGI0)
When adding a socket option support in MPTCP, both the get and set parts are supposed to be implemented. IP(V6)_FREEBIND and IP(V6)_TRANSPARENT support for the setsockopt part has been added a while ago, but it looks like the get part got forgotten. It should have been present as a way to verify a setting has been set as expected, and not to act differently from TCP or any other socket types. Everything was in place to expose it, just the last step was missing. Only new code is added to cover these specific getsockopt(), that seems safe. Fixes: c9406a23c116 ("mptcp: sockopt: add SOL_IP freebind & transparent options") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314-net-mptcp-fix-data-stream-corr-sockopt-v1-3-122dbb249db3@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21mptcp: sockopt: fix getting IPV6_V6ONLYMatthieu Baerts (NGI0)
When adding a socket option support in MPTCP, both the get and set parts are supposed to be implemented. IPV6_V6ONLY support for the setsockopt part has been added a while ago, but it looks like the get part got forgotten. It should have been present as a way to verify a setting has been set as expected, and not to act differently from TCP or any other socket types. Not supporting this getsockopt(IPV6_V6ONLY) blocks some apps which want to check the default value, before doing extra actions. On Linux, the default value is 0, but this can be changed with the net.ipv6.bindv6only sysctl knob. On Windows, it is set to 1 by default. So supporting the get part, like for all other socket options, is important. Everything was in place to expose it, just the last step was missing. Only new code is added to cover this specific getsockopt(), that seems safe. Fixes: c9b95a135987 ("mptcp: support IPV6_V6ONLY setsockopt") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/550 Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314-net-mptcp-fix-data-stream-corr-sockopt-v1-2-122dbb249db3@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21Merge branch 'netconsole-add-support-for-userdata-release'Paolo Abeni
Breno Leitao says: ==================== netconsole: Add support for userdata release I am submitting a series of patches that introduce a new feature for the netconsole subsystem, specifically the addition of the 'release' field to the sysdata structure. This feature allows the kernel release/version to be appended to the userdata dictionary in every message sent, enhancing the information available for debugging and monitoring purposes. This complements the already supported release prepend feature, which was added some time ago. The release prepend appends the release information at the message header, which is not ideal for two reasons: 1) It is difficult to determine if a message includes this information, making it hard and resource-intensive to parse. 2) When a message is fragmented, the release information is appended to every message fragment, consuming valuable space in the packet. The "release prepend" feature was created before the concept of userdata and sysdata. Now that this format has proven successful, we are implementing the release feature as part of this enhanced structure. This patch series aims to improve the netconsole subsystem by providing a more efficient and user-friendly way to include kernel release information in messages. I believe these changes will significantly aid in system analysis and troubleshooting. Suggested-by: Manu Bretelle <chantr4@gmail.com> Signed-off-by: Breno Leitao <leitao@debian.org> ==================== Link: https://patch.msgid.link/20250314-netcons_release-v1-0-07979c4b86af@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21docs: netconsole: document release featureBreno Leitao
Add documentation explaining the kernel release auto-population feature in netconsole. This feature appends kernel version information to the userdata dictionary in every message sent when enabled via the `release_enabled` file in the configfs hierarchy. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314-netcons_release-v1-6-07979c4b86af@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21selftests: netconsole: Add tests for 'release' feature in sysdataBreno Leitao
Expands the self-tests to include the 'release' feature in sysdata. Verifies that enabling the 'release' feature appends the correct data and ensures that disabling it functions as expected. When enabled, the message should have an item similar to in the userdata: `release=$(uname -r)` Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314-netcons_release-v1-5-07979c4b86af@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21netconsole: append release to sysdataBreno Leitao
Append the init_utsname()->release to sysdata buffer before sending the message in case the feature is set. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314-netcons_release-v1-4-07979c4b86af@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21netconsole: add 'sysdata' suffix to related functionsBreno Leitao
This commit appends a common "sysdata" suffix to functions responsible for appending data to sysdata. This change enhances code clarity and prevents naming conflicts with other "append" functions, particularly in anticipation of the upcoming inclusion of the `release` field in the next patch. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314-netcons_release-v1-3-07979c4b86af@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21netconsole: implement configfs for release_enabledBreno Leitao
Implement the configfs helpers to show and set release_enabled configfs directories under userdata. When enabled, set the feature bit in netconsole_target->sysdata_fields. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314-netcons_release-v1-2-07979c4b86af@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21netconsole: introduce 'release' as a new sysdata fieldBreno Leitao
This commit adds a new feature to the sysdata structure, allowing the kernel release/version to be appended as part of sysdata. Additionally, it updates the logic to count this new field as a used entry when enabled. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314-netcons_release-v1-1-07979c4b86af@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21net: airoha: fix CONFIG_DEBUG_FS checkArnd Bergmann
The #if check causes a build failure when CONFIG_DEBUG_FS is turned off: In file included from drivers/net/ethernet/airoha/airoha_eth.c:17: drivers/net/ethernet/airoha/airoha_eth.h:543:5: error: "CONFIG_DEBUG_FS" is not defined, evaluates to 0 [-Werror=undef] 543 | #if CONFIG_DEBUG_FS | ^~~~~~~~~~~~~~~ Replace it with the correct #ifdef. Fixes: 3fe15c640f38 ("net: airoha: Introduce PPE debugfs support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250314155009.4114308-1-arnd@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21Merge tag 'io_uring-6.14-20250321' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fix from Jens Axboe: "Single fix heading to stable, fixing an issue with io_req_msg_cleanup() sometimes too eagerly clearing cleanup flags" * tag 'io_uring-6.14-20250321' of git://git.kernel.dk/linux: io_uring/net: don't clear REQ_F_NEED_CLEANUP unconditionally
2025-03-21ALSA: timer: Don't take register_mutex with copy_from/to_user()Takashi Iwai
The infamous mmap_lock taken in copy_from/to_user() can be often problematic when it's called inside another mutex, as they might lead to deadlocks. In the case of ALSA timer code, the bad pattern is with guard(mutex)(&register_mutex) that covers copy_from/to_user() -- which was mistakenly introduced at converting to guard(), and it had been carefully worked around in the past. This patch fixes those pieces simply by moving copy_from/to_user() out of the register mutex lock again. Fixes: 3923de04c817 ("ALSA: pcm: oss: Use guard() for setup") Reported-by: syzbot+2b96f44164236dda0f3b@syzkaller.appspotmail.com Closes: https://lore.kernel.org/67dd86c8.050a0220.25ae54.0059.GAE@google.com Link: https://patch.msgid.link/20250321172653.14310-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-03-21MAINTAINERS: update bridge entryNikolay Aleksandrov
Roopa has decided to withdraw as a bridge maintainer and Ido has agreed to step up and co-maintain the bridge with me. He has been very helpful in bridge patch reviews and has contributed a lot to the bridge over the years. Add an entry for Roopa to CREDITS and also add bridge's headers to its MAINTAINERS entry. Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250314100631.40999-1-razor@blackwall.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21net: mctp: Remove unnecessary cast in mctp_cbHerbert Xu
The void * cast in mctp_cb is unnecessary as it's already been done at the start of the function. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/Z9PwOQeBSYlgZlHq@gondor.apana.org.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21PCI/bwctrl: Fix pcie_bwctrl_select_speed() return typeIlpo Järvinen
pcie_bwctrl_select_speed() should take __fls() of the speed bit, not return it as a raw value. Instead of directly returning 2.5GT/s speed bit, simply assign the fallback speed (2.5GT/s) into supported_speeds variable to share the normal return path that calls pcie_supported_speeds2target_speed() to calculate __fls(). This code path is not very likely to execute because pcie_get_supported_speeds() should provide valid ->supported_speeds but a spec violating device could fail to synthesize any speed in pcie_get_supported_speeds(). It could also happen in case the supported_speeds intersection is empty (also a violation of the current PCIe specs). Link: https://lore.kernel.org/r/20250321163103.5145-1-ilpo.jarvinen@linux.intel.com Fixes: de9a6c8d5dbf ("PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed") Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-21Merge branch 'net-phy-remove-calls-to-devm_hwmon_sanitize_name'Paolo Abeni
Heiner Kallweit says: ==================== net: phy: remove calls to devm_hwmon_sanitize_name Since c909e68f8127 ("hwmon: (core) Use device name as a fallback in devm_hwmon_device_register_with_info") we can simply provide NULL as name argument. ==================== Link: https://patch.msgid.link/198f3cd0-6c39-4783-afe7-95576a4b8539@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21net: phy: marvell-88q2xxx: remove call to devm_hwmon_sanitize_nameHeiner Kallweit
Since c909e68f8127 ("hwmon: (core) Use device name as a fallback in devm_hwmon_device_register_with_info") we can simply provide NULL as name argument. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/59c485e4-983c-42f6-9114-916703a62e3f@gmail.com Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21net: phy: mxl-gpy: remove call to devm_hwmon_sanitize_nameHeiner Kallweit
Since c909e68f8127 ("hwmon: (core) Use device name as a fallback in devm_hwmon_device_register_with_info") we can simply provide NULL as name argument. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/e34c4802-20ce-4556-a47c-812e602e8526@gmail.com Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21net: phy: tja11xx: remove call to devm_hwmon_sanitize_nameHeiner Kallweit
Since c909e68f8127 ("hwmon: (core) Use device name as a fallback in devm_hwmon_device_register_with_info") we can simply provide NULL as name argument. Note that neither priv->hwmon_name nor priv->hwmon_dev are used outside tja11xx_hwmon_register. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/4452cb7e-1a2f-4213-b49f-9de196be9204@gmail.com Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21net: phy: realtek: remove call to devm_hwmon_sanitize_nameHeiner Kallweit
Since c909e68f8127 ("hwmon: (core) Use device name as a fallback in devm_hwmon_device_register_with_info") we can simply provide NULL as name argument. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/6e8d26f4-8d0a-4c83-aec3-378847a377eb@gmail.com Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21PCI: pciehp: Don't enable HPIE when resuming in poll modeIlpo Järvinen
PCIe hotplug can operate in poll mode without interrupt handlers using a polling kthread only. eb34da60edee ("PCI: pciehp: Disable hotplug interrupt during suspend") failed to consider that and enables HPIE (Hot-Plug Interrupt Enable) unconditionally when resuming the Port. Only set HPIE if non-poll mode is in use. This makes pcie_enable_interrupt() match how pcie_enable_notification() already handles HPIE. Link: https://lore.kernel.org/r/20250321162114.3939-1-ilpo.jarvinen@linux.intel.com Fixes: eb34da60edee ("PCI: pciehp: Disable hotplug interrupt during suspend") Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de>
2025-03-21of: address: Allow to specify nonposted-mmio per-deviceKonrad Dybcio
Certain IP blocks may strictly require/expect a nE mapping to function correctly, while others may be fine without it (which is preferred for performance reasons). Allow specifying nonposted-mmio on a per-device basis. Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250319-topic-nonposted_mmio-v1-2-dfb886fbd15f@oss.qualcomm.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-03-21of: address: Expand nonposted-mmio to non-Apple Silicon platformsKonrad Dybcio
The nE memory attribute may be utilized by various implementations, not limited to Apple Silicon platforms. Drop the early CONFIG_ARCH_APPLE check. Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250319-topic-nonposted_mmio-v1-1-dfb886fbd15f@oss.qualcomm.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-03-21docs: dt-bindings: Specify ordering for properties within groupsDragan Simic
Ordering of the individual properties inside each property group benefits from applying natural sort order [1] by the property names, because it results in more logical and more usable property lists, similarly to what's already the case with the alpha-numerical ordering of the nodes without unit addresses. Let's have this clearly specified in the DTS coding style, and let's expand the provided node example a bit, to actually show the results of applying natural sort order. Applying strict alpha-numerical ordering can result in property lists that are suboptimal from the usability standpoint. For the provided example, which stems from a real-world DT, [2][3][4] applying strict alpha-numerical ordering produces the following undesirable result: vdd-0v9-supply = <&board_vreg1>; vdd-12v-supply = <&board_vreg3>; vdd-1v8-supply = <&board_vreg4>; vdd-3v3-supply = <&board_vreg2>; Having the properties sorted in natural order by their associated voltages is more logical, more usable, and a bit more consistent. [1] https://en.wikipedia.org/wiki/Natural_sort_order [2] https://lore.kernel.org/linux-rockchip/b39cfd7490d8194f053bf3971f13a43472d1769e.1740941097.git.dsimic@manjaro.org/ [3] https://lore.kernel.org/linux-rockchip/174104113599.8946.16805724674396090918.b4-ty@sntech.de/ [4] https://lore.kernel.org/linux-rockchip/757afa87255212dfa5abf4c0e31deb08@manjaro.org/ Signed-off-by: Dragan Simic <dsimic@manjaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/6468619098f94d8acb00de0431c414c5fcfbbdbf.1742532899.git.dsimic@manjaro.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-03-21drm/amd/pm: Update feature list for smu_v13_0_6Asad Kamal
Update feature list for smu_v13_0_6 to show vcn & smu deep sleep feature enable status Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu: Add parameter documentation for amdgpu_sync_fenceSrinivasan Shanmugam
The 'flags' parameter, which specifies memory allocation behavior while creating a sync entry, Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c:162: warning: Function parameter or struct member 'flags' not described in 'amdgpu_sync_fence' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu/discovery: optionally use fw based ip discoveryAlex Deucher
On chips without native IP discovery support, use the fw binary if available, otherwise we can continue without it. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Flora Cui <flora.cui@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu/discovery: use specific ip_discovery.bin for legacy asicsFlora Cui
vega10/vega12/vega20/raven/raven2/picasso/arcturus/aldebaran Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>