summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-01-14ice: Add in/out PTP pin delaysKarol Kolacinski
HW can have different input/output delays for each of the pins. Currently, only E82X adapters have delay compensation based on TSPLL config and E810 adapters have constant 1 ms compensation, both cases only for output delays and the same one for all pins. E825 adapters have different delays for SDP and other pins. Those delays are also based on direction and input delays are different than output ones. This is the main reason for moving delays to pin description structure. Add a field in ice_ptp_pin_desc structure to reflect that. Delay values are based on approximate calculations of HW delays based on HW spec. Implement external timestamp (input) delay compensation. Remove existing definitions and wrappers for periodic output propagation delays. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: implement low latency PHY timer updatesJacob Keller
Programming the PHY registers in preparation for an increment value change or a timer adjustment on E810 requires issuing Admin Queue commands for each PHY register. It has been found that the firmware Admin Queue processing occasionally has delays of tens or rarely up to hundreds of milliseconds. This delay cascades to failures in the PTP applications which depend on these updates being low latency. Consider a standard PTP profile with a sync rate of 16 times per second. This means there is ~62 milliseconds between sync messages. A complete cycle of the PTP algorithm 1) Sync message (with Tx timestamp) from source 2) Follow-up message from source 3) Delay request (with Tx timestamp) from sink 4) Delay response (with Rx timestamp of request) from source 5) measure instantaneous clock offset 6) request time adjustment via CLOCK_ADJTIME systemcall The Tx timestamps have a default maximum timeout of 10 milliseconds. If we assume that the maximum possible time is used, this leaves us with ~42 milliseconds of processing time for a complete cycle. The CLOCK_ADJTIME system call is synchronous and will block until the driver completes its timer adjustment or frequency change. If the writes to prepare the PHY timers get hit by a latency spike of 50 milliseconds, then the PTP application will be delayed past the point where the next cycle should start. Packets from the next cycle may have already arrived and are waiting on the socket. In particular, LinuxPTP ptp4l may start complaining about missing an announce message from the source, triggering a fault. In addition, the clockcheck logic it uses may trigger. This clockcheck failure occurs because the timestamp captured by hardware is compared against a reading of CLOCK_MONOTONIC. It is assumed that the time when the Rx timestamp is captured and the read from CLOCK_MONOTONIC are relatively close together. This is not the case if there is a significant delay to processing the Rx packet. Newer firmware supports programming the PHY registers over a low latency interface which bypasses the Admin Queue. Instead, software writes to the REG_LL_PROXY_L and REG_LL_PROXY_H registers. Firmware reads these registers and then programs the PHY timers. Implement functions to use this interface when available to program the PHY timers instead of using the Admin Queue. This avoids the Admin Queue latency and ensures that adjustments happen within acceptable latency bounds. Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: check low latency PHY timer update firmware capabilityJacob Keller
Newer versions of firmware support programming the PHY timer via the low latency interface exposed over REG_LL_PROXY_L and REG_LL_PROXY_H. Add support for checking the device capabilities for this feature. Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: add lock to protect low latency interfaceJacob Keller
Newer firmware for the E810 devices support a 'low latency' interface to interact with the PHY without using the Admin Queue. This is interacted with via the REG_LL_PROXY_L and REG_LL_PROXY_H registers. Currently, this interface is only used for Tx timestamps. There are two different mechanisms, including one which uses an interrupt for firmware to signal completion. However, these two methods are mutually exclusive, so no synchronization between them was necessary. This low latency interface is being extended in future firmware to support also programming the PHY timers. Use of the interface for PHY timers will need synchronization to ensure there is no overlap with a Tx timestamp. The interrupt-based response complicates the locking somewhat. We can't use a simple spinlock. This would require being acquired in ice_ptp_req_tx_single_tstamp, and released in ice_ptp_complete_tx_single_tstamp. The ice_ptp_req_tx_single_tstamp function is called from the threaded IRQ, and the ice_ptp_complete_tx_single_stamp is called from the low latency IRQ, so we would need to acquire the lock with IRQs disabled. To handle this, we'll use a wait queue along with wait_event_interruptible_locked_irq in the update flows which don't use the interrupt. The interrupt flow will acquire the wait queue lock, set the ATQBAL_FLAGS_INTR_IN_PROGRESS, and then initiate the firmware low latency request, and unlock the wait queue lock. Upon receipt of the low latency interrupt, the lock will be acquired, the ATQBAL_FLAGS_INTR_IN_PROGRESS bit will be cleared, and the firmware response will be captured, and wake_up_locked() will be called on the wait queue. The other flows will use wait_event_interruptible_locked_irq() to wait until the ATQBAL_FLAGS_INTR_IN_PROGRESS is clear. This function checks the condition under lock, but does not hold the lock while waiting. On return, the lock is held, and a return of zero indicates we hold the lock and the in-progress flag is not set. This will ensure that threads which need to use the low latency interface will sleep until they can acquire the lock without any pending low latency interrupt flow interfering. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: rename TS_LL_READ* macros to REG_LL_PROXY_H_*Jacob Keller
The TS_LL_READ macros are used as part of the low latency Tx timestamp interface. A future firmware extension will add support for performing PHY timer updates over this interface. Using TS_LL_READ as the prefix for these macros will be confusing once the interface is used for other purposes. Rename the macros, using the prefix REG_LL_PROXY_H, to better clarify that this is for the low latency interface. Additionally add macros for PF_SB_ATQBAH and PF_SB_ATQBAL registers to better clarify content of this registers as PF_SB_ATQBAH contain low part of Tx timestamp Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: use read_poll_timeout_atomic in ice_read_phy_tstamp_ll_e810Jacob Keller
The ice_read_phy_tstamp_ll_e810 function repeatedly reads the PF_SB_ATQBAL register until the TS_LL_READ_TS bit is cleared. This is a perfect candidate for using rd32_poll_timeout. However, the default implementation uses a sleep-based wait. Use read_poll_timeout_atomic macro which is based on the non-sleeping implementation and use it to replace the loop reading in the ice_read_phy_tstamp_ll_e810 function. Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: use string choice helpersR Sundar
Use string choice helpers for better readability. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@inria.fr> Closes: https://lore.kernel.org/r/202410121553.SRNFzc2M-lkp@intel.com/ Signed-off-by: R Sundar <prosunofficial@gmail.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: add fw and port health reportersKonrad Knitter
Firmware generates events for global events or port specific events. Driver shall subscribe for health status events from firmware on supported FW versions >= 1.7.6. Driver shall expose those under specific health reporter, two new reporters are introduced: - FW health reporter shall represent global events (problems with the image, recovery mode); - Port health reporter shall represent port-specific events (module failure). Firmware only reports problems when those are detected, it does not store active fault list. Driver will hold only last global and last port-specific event. Driver will report all events via devlink health report, so in case of multiple events of the same source they can be reviewed using devlink autodump feature. $ devlink health pci/0000:b1:00.3: reporter fw state healthy error 0 recover 0 auto_dump true reporter port state error error 1 recover 0 last_dump_date 2024-03-17 last_dump_time 09:29:29 auto_dump true $ devlink health diagnose pci/0000:b1:00.3 reporter port Syndrome: 262 Description: Module is not present. Possible Solution: Check that the module is inserted correctly. Port Number: 0 Tested on Intel Corporation Ethernet Controller E810-C for SFP Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Co-developed-by: Sharon Haroni <sharon.haroni@intel.com> Signed-off-by: Sharon Haroni <sharon.haroni@intel.com> Co-developed-by: Nicholas Nunley <nicholas.d.nunley@intel.com> Signed-off-by: Nicholas Nunley <nicholas.d.nunley@intel.com> Co-developed-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Konrad Knitter <konrad.knitter@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: add recipe priority check in searchMichal Swiatkowski
The new recipe should be added even if exactly the same recipe already exists with different priority. Example use case is when the rule is being added from TC tool context. It should has the highest priority, but if the recipe already exists the rule will inherit it priority. It can lead to the situation when the rule added from TC tool has lower priority than expected. The solution is to check the recipe priority when trying to find existing one. Previous recipe is still useful. Example: RID 8 -> priority 4 RID 10 -> priority 7 The difference is only in priority rest is let's say eth + mac + direction. Adding ARP + MAC_A + RX on RID 8, forward to VF0_VSI After that IP + MAC_B + RX on RID 10 (from TC tool), forward to PF0 Both will work. In case of adding ARP + MAC_A + RX on RID 8, forward to VF0_VSI ARP + MAC_A + RX on RID 10, forward to PF0. Only second one will match, but this is expected. Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: ice_probe: init ice_adapter after HW initPrzemek Kitszel
Move ice_adapter initialization to be after HW init, so it could use HW capabilities, like number of PFs. This is needed for devlink-resource based RSS LUT size management for PF/VF (not in this series). Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: minor: rename goto labels from err to unrollPrzemek Kitszel
Clean up goto labels after previous commit, to conform to single naming scheme in ice_probe() and ice_init_dev(). Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: split ice_init_hw() out from ice_init_dev()Przemek Kitszel
Split ice_init_hw() call out from ice_init_dev(). Such move enables pulling the former to be even earlier on call path, what would enable moving ice_adapter init to be between the two (in subsequent commit). Such move enables ice_adapter to know about number of PFs. Do the same for ice_deinit_hw(), so the init and deinit calls could be easily mirrored. Next commit will rename unrelated goto labels to unroll prefix. Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14ice: c827: move wait for FW to ice_init_hw()Przemek Kitszel
Move call to ice_wait_for_fw() from ice_init_dev() into ice_init_hw(), where it fits better. This requires also to move ice_wait_for_fw() to ice_common.c. ice_is_pf_c827() is now used only in ice_common.c, so it could be static. CC: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-01-14MAINTAINERS: downgrade Ethernet NIC drivers without CI reportingJakub Kicinski
Per previous change downgrade all NIC drivers (discrete, embedded, SoC components, virtual) which don't report test results to CI from Supported to Maintained. Also include all components or building blocks of NIC drivers (separate entries for "shared" code, subsystem support like PTP or entries for specific offloads etc.) Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20250111024359.3678956-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14docs: netdev: document requirements for Supported statusJakub Kicinski
As announced back in April, require running upstream tests to maintain Supported status for NIC drivers: https://lore.kernel.org/20240425114200.3effe773@kernel.org Multiple vendors have been "working on it" for months. Let's make it official. Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250111024359.3678956-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14x86/fred: Fix the FRED RSP0 MSR out of sync with its per-CPU cacheXin Li (Intel)
The FRED RSP0 MSR is only used for delivering events when running userspace. Linux leverages this property to reduce expensive MSR writes and optimize context switches. The kernel only writes the MSR when about to run userspace *and* when the MSR has actually changed since the last time userspace ran. This optimization is implemented by maintaining a per-CPU cache of FRED RSP0 and then checking that against the value for the top of current task stack before running userspace. However cpu_init_fred_exceptions() writes the MSR without updating the per-CPU cache. This means that the kernel might return to userspace with MSR_IA32_FRED_RSP0==0 when it needed to point to the top of current task stack. This would induce a double fault (#DF), which is bad. A context switch after cpu_init_fred_exceptions() can paper over the issue since it updates the cached value. That evidently happens most of the time explaining how this bug got through. Fix the bug through resynchronizing the FRED RSP0 MSR with its per-CPU cache in cpu_init_fred_exceptions(). Fixes: fe85ee391966 ("x86/entry: Set FRED RSP0 on return to userspace instead of context switch") Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20250110174639.1250829-1-xin%40zytor.com
2025-01-14Merge tag 'seccomp-v6.13-rc8' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp fix from Kees Cook: "Fix a randconfig failure: - Unconditionally define stub for !CONFIG_SECCOMP (Linus Walleij)" * tag 'seccomp-v6.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: seccomp: Stub for !CONFIG_SECCOMP
2025-01-14Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Fix E825 initialization Grzegorz Nitka says: E825 products have incorrect initialization procedure, which may lead to initialization failures and register values. Fix E825 products initialization by adding correct sync delay, checking the PHY revision only for current PHY and adding proper destination device when reading port/quad. In addition, E825 uses PF ID for indexing per PF registers and as a primary PHY lane number, which is incorrect. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Add correct PHY lane assignment ice: Fix ETH56G FC-FEC Rx offset value ice: Fix quad registers read on E825 ice: Fix E825 initialization ==================== Link: https://patch.msgid.link/20250113182840.3564250-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14Merge branch 'mptcp-fixes-for-connect-selftest-flakes'Jakub Kicinski
Matthieu Baerts says: ==================== mptcp: fixes for connect selftest flakes Last week, Jakub reported [1] that the MPTCP Connect selftest was unstable. It looked like it started after the introduction of some fixes [2]. After analysis from Paolo, these patches revealed existing bugs, that should be fixed by the following patches. - Patch 1: Make sure ACK are sent when MPTCP-level window re-opens. In some corner cases, the other peer was not notified when more data could be sent. A fix for v5.11, but depending on a feature introduced in v5.19. - Patch 2: Fix spurious wake-up under memory pressure. In this situation, the userspace could be invited to read data not being there yet. A fix for v6.7. - Patch 3: Fix a false positive error when running the MPTCP Connect selftest with the "disconnect" cases. The userspace could disconnect the socket too soon, which would reset (MP_FASTCLOSE) the connection, interpreted as an error by the test. A fix for v5.17. Link: https://lore.kernel.org/20250107131845.5e5de3c5@kernel.org [1] Link: https://lore.kernel.org/20241230-net-mptcp-rbuf-fixes-v1-0-8608af434ceb@kernel.org [2] ==================== Link: https://patch.msgid.link/20250113-net-mptcp-connect-st-flakes-v1-0-0d986ee7b1b6@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14selftests: mptcp: avoid spurious errors on disconnectPaolo Abeni
The disconnect test-case generates spurious errors: INFO: disconnect INFO: extra options: -I 3 -i /tmp/tmp.r43niviyoI 01 ns1 MPTCP -> ns1 (10.0.1.1:10000 ) MPTCP (duration 140ms) [FAIL] file received by server does not match (in, out): Unexpected revents: POLLERR/POLLNVAL(19) -rw-r--r-- 1 root root 10028676 Jan 10 10:47 /tmp/tmp.r43niviyoI.disconnect Trailing bytes are: ��\����R���!8��u2��5N% -rw------- 1 root root 9992290 Jan 10 10:47 /tmp/tmp.Os4UbnWbI1 Trailing bytes are: ��\����R���!8��u2��5N% 02 ns1 MPTCP -> ns1 (dead:beef:1::1:10001) MPTCP (duration 206ms) [ OK ] 03 ns1 MPTCP -> ns1 (dead:beef:1::1:10002) TCP (duration 31ms) [ OK ] 04 ns1 TCP -> ns1 (dead:beef:1::1:10003) MPTCP (duration 26ms) [ OK ] [FAIL] Tests of the full disconnection have failed Time: 2 seconds The root cause is actually in the user-space bits: the test program currently disconnects as soon as all the pending data has been spooled, generating an FASTCLOSE. If such option reaches the peer before the latter has reached the closed status, the msk socket will report an error to the user-space, as per protocol specification, causing the above failure. Address the issue explicitly waiting for all the relevant sockets to reach a closed status before performing the disconnect. Fixes: 05be5e273c84 ("selftests: mptcp: add disconnect tests") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250113-net-mptcp-connect-st-flakes-v1-3-0d986ee7b1b6@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14mptcp: fix spurious wake-up on under memory pressurePaolo Abeni
The wake-up condition currently implemented by mptcp_epollin_ready() is wrong, as it could mark the MPTCP socket as readable even when no data are present and the system is under memory pressure. Explicitly check for some data being available in the receive queue. Fixes: 5684ab1a0eff ("mptcp: give rcvlowat some love") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250113-net-mptcp-connect-st-flakes-v1-2-0d986ee7b1b6@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14mptcp: be sure to send ack when mptcp-level window re-opensPaolo Abeni
mptcp_cleanup_rbuf() is responsible to send acks when the user-space reads enough data to update the receive windows significantly. It tries hard to avoid acquiring the subflow sockets locks by checking conditions similar to the ones implemented at the TCP level. To avoid too much code duplication - the MPTCP protocol can't reuse the TCP helpers as part of the relevant status is maintained into the msk socket - and multiple costly window size computation, mptcp_cleanup_rbuf uses a rough estimate for the most recently advertised window size: the MPTCP receive free space, as recorded as at last-ack time. Unfortunately the above does not allow mptcp_cleanup_rbuf() to detect a zero to non-zero win change in some corner cases, skipping the tcp_cleanup_rbuf call and leaving the peer stuck. After commit ea66758c1795 ("tcp: allow MPTCP to update the announced window"), MPTCP has actually cheap access to the announced window value. Use it in mptcp_cleanup_rbuf() for a more accurate ack generation. Fixes: e3859603ba13 ("mptcp: better msk receive window updates") Cc: stable@vger.kernel.org Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/20250107131845.5e5de3c5@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250113-net-mptcp-connect-st-flakes-v1-1-0d986ee7b1b6@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14Bluetooth: iso: Allow BIG re-syncIulia Tanasescu
A Broadcast Sink might require BIG sync to be terminated and re-established multiple times, while keeping the same PA sync handle active. This can be possible if the configuration of the listening (PA sync) socket is reset once all bound BISes are established and accepted by the user space: 1. The DEFER setup flag needs to be reset on the parent socket, to allow another BIG create sync procedure to be started on socket read. 2. The BT_SK_BIG_SYNC flag needs to be cleared on the parent socket, to allow another BIG create sync command to be sent. 3. The socket state needs to transition from BT_LISTEN to BT_CONNECTED, to mark that the listening process has completed and another one can be started if needed. Signed-off-by: Iulia Tanasescu <iulia.tanasescu@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-01-14Merge branch 'tcp-add-a-new-paws_ack-drop-reason'Jakub Kicinski
Eric Dumazet says: ==================== tcp: add a new PAWS_ACK drop reason Current TCP_RFC7323_PAWS drop reason is too generic and can cause confusion. One common source for these drops are ACK packets coming too late. A prior packet with payload already changed tp->rcv_nxt. Add TCP_RFC7323_PAWS_ACK new drop reason, and do not generate a DUPACK for such old ACK. ==================== Link: https://patch.msgid.link/20250113135558.3180360-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14tcp: add LINUX_MIB_PAWS_OLD_ACK SNMP counterEric Dumazet
Prior patch in the series added TCP_RFC7323_PAWS_ACK drop reason. This patch adds the corresponding SNMP counter, for folks using nstat instead of tracing for TCP diagnostics. nstat -az | grep PAWSOldAck Suggested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Tested-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250113135558.3180360-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14tcp: add TCP_RFC7323_PAWS_ACK drop reasonEric Dumazet
XPS can cause reorders because of the relaxed OOO conditions for pure ACK packets. For hosts not using RFS, what can happpen is that ACK packets are sent on behalf of the cpu processing NIC interrupts, selecting TX queue A for ACK packet P1. Then a subsequent sendmsg() can run on another cpu. TX queue selection uses the socket hash and can choose another queue B for packets P2 (with payload). If queue A is more congested than queue B, the ACK packet P1 could be sent on the wire after P2. A linux receiver when processing P1 (after P2) currently increments LINUX_MIB_PAWSESTABREJECTED (TcpExtPAWSEstab) and use TCP_RFC7323_PAWS drop reason. It might also send a DUPACK if not rate limited. In order to better understand this pattern, this patch adds a new drop_reason : TCP_RFC7323_PAWS_ACK. For old ACKS like these, we no longer increment LINUX_MIB_PAWSESTABREJECTED and no longer sends a DUPACK, keeping credit for other more interesting DUPACK. perf record -e skb:kfree_skb -a perf script ... swapper 0 [148] 27475.438637: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK swapper 0 [208] 27475.438706: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK swapper 0 [208] 27475.438908: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK swapper 0 [148] 27475.439010: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK swapper 0 [148] 27475.439214: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK swapper 0 [208] 27475.439286: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK ... Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://patch.msgid.link/20250113135558.3180360-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14tcp: add drop_reason support to tcp_disordered_ack()Eric Dumazet
Following patch is adding a new drop_reason to tcp_validate_incoming(). Change tcp_disordered_ack() to not return a boolean anymore, but a drop reason. Change its name to tcp_disordered_ack_check() Refactor tcp_validate_incoming() to ease the code review of the following patch, and reduce indentation level. This patch is a refactor, with no functional change. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://patch.msgid.link/20250113135558.3180360-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14net: phy: dp83822: Fix typo "outout" -> "output"Colin Ian King
There is a typo in a phydev_err message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250113091555.23594-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14wifi: ath12k: fix key cache handlingAditya Kumar Singh
Currently, an interface is created in the driver during channel assignment. If mac80211 attempts to set a key for an interface before this assignment, the driver caches the key. Once the interface is created, the driver installs the cached key to the hardware. This sequence is exemplified in mesh mode operation where the group key is set before channel assignment. However, in ath12k_mac_update_key_cache(), after caching the key, due to incorrect logic, it is deleted from the cache during the subsequent loop iteration. As a result, after the interface is created, the driver does not find any cached key, and the key is not installed to the hardware which is wrong. This leads to issue in mesh, where broadcast traffic is not encrypted over the air. Fix this issue by adjusting the logic of ath12k_mac_update_key_cache() properly. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1 Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3-03253.1-QCAHKSWPL_SILICONZ-29 # Nicolas Escande <nico.escande@gmail.com> Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1 # Nicolas Escande <nico.escande@gmail.com> Fixes: 25e18b9d6b4b ("wifi: ath12k: modify ath12k_mac_op_set_key() for MLO") Signed-off-by: Aditya Kumar Singh <quic_adisi@quicinc.com> Acked-by: Kalle Valo <kvalo@kernel.org> Tested-by: Nicolas Escande <nico.escande@gmail.com> Link: https://patch.msgid.link/20250112-fix_key_cache_handling-v2-1-70e142c6153e@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-01-14wifi: ath12k: Fix uninitialized variable access in ath12k_mac_allocate() ↵Karthikeyan Periyasamy
function Currently, the uninitialized variable 'ab' is accessed in the ath12k_mac_allocate() function. Initialize 'ab' with the first radio device present in the hardware abstraction handle (ah). Additionally, move the default setting procedure from the pdev mapping iteration to the total radio calculating iteration for better code readability. Perform the maximum radio validation check for total_radio to ensure that both num_hw and radio_per_hw are validated indirectly, as these variables are derived from total_radio. This also fixes the below Smatch static checker warning. Smatch warning: ath12k_mac_allocate() error: uninitialized symbol 'ab' Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1 Fixes: a343d97f27f5 ("wifi: ath12k: move struct ath12k_hw from per device to group") Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com> Acked-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com> Acked-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20250112071630.4059410-5-quic_periyasa@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-01-14wifi: ath12k: Remove ath12k_get_num_hw() helper functionKarthikeyan Periyasamy
Currently, the ath12k_get_num_hw() helper function takes the device handle as an argument. Here, the number of hardware is retrieved from the group handle. Demanding the device handle from the caller is unnecessary since in some cases the group handle is already available. Additionally, there is no longer a need for multiple indirections to get the number of hardware. Therefore, remove this helper function and directly use ag->num_hw. This change also fixes the below Smatch static checker warning. Smatch warning: ath12k_mac_destroy() error: we previously assumed 'ab' could be null Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1 Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/ath12k/3e705de0-67d1-4437-97ff-4828d83ae2af@stanley.mountain/ Closes: https://scan7.scan.coverity.com/#/project-view/52682/11354?selectedIssue=1602340 Fixes: a343d97f27f5 ("wifi: ath12k: move struct ath12k_hw from per device to group") Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com> Acked-by: Kalle Valo <kvalo@kernel.org> Acked-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com> Link: https://patch.msgid.link/20250112071630.4059410-4-quic_periyasa@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-01-14wifi: ath12k: Refactor the ath12k_hw get helper function argumentKarthikeyan Periyasamy
Currently, ath12k_hw is placed inside the ath12k_hw_group. However, the ath12k_hw get helper function takes the device handle and the index as parameters. Here, the index parameter is specific to the group handle. Therefore, change this helper function argument from the device handle to the group handle. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1 Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com> Acked-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com> Acked-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20250112071630.4059410-3-quic_periyasa@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-01-14wifi: ath12k: Refactor ath12k_hw set helper function argumentKarthikeyan Periyasamy
Currently, ath12k_hw is placed inside the ath12k_hw_group. However, the ath12k_hw set helper function takes the device handle and the index as parameters. Here, the index parameter is specific to the group handle. Therefore, change this helper function argument from the device handle to the group handle. Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1 Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com> Acked-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com> Acked-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20250112071630.4059410-2-quic_periyasa@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-01-14drm/v3d: Ensure job pointer is set to NULL after job completionMaíra Canal
After a job completes, the corresponding pointer in the device must be set to NULL. Failing to do so triggers a warning when unloading the driver, as it appears the job is still active. To prevent this, assign the job pointer to NULL after completing the job, indicating the job has finished. Fixes: 14d1d1908696 ("drm/v3d: Remove the bad signaled() implementation.") Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250113154741.67520-1-mcanal@igalia.com
2025-01-14cpufreq: Move endif to the end of Kconfig fileViresh Kumar
It is possible to enable few cpufreq drivers, without the framework being enabled. This happened due to a bug while moving the entries earlier. Fix it. Fixes: 7ee1378736f0 ("cpufreq: Move CPPC configs to common Kconfig and add RISC-V") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Pierre Gondois <pierre.gondois@arm.com> Reviewed-by: Sunil V L <sunilvl@ventanamicro.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://patch.msgid.link/84ac7a8fa72a8fe20487bb0a350a758bce060965.1736488384.git.viresh.kumar@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-01-14Merge tag 'pci-v6.13-fixes-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fix from Bjorn Helgaas: - Prevent bwctrl NULL pointer dereference that caused hangs on shutdown on ASUS ROG Strix SCAR 17 G733PYV (Lukas Wunner) * tag 'pci-v6.13-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI/bwctrl: Fix NULL pointer deref on unbind and bind
2025-01-14Merge branch 'mlx5-next' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Tariq Toukan says: ==================== mlx5-next updates 2025-01-14 The following pull-request contains mlx5 IFC updates. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Add nic_cap_reg and vhca_icm_ctrl registers net/mlx5: SHAMPO: Introduce new SHAMPO specific HCA caps net/mlx5: Add support for MRTCQ register net/mlx5: Update mlx5_ifc to support FEC for 200G per lane link modes ==================== Link: https://patch.msgid.link/20250114055700.1928736-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "One iscsi driver fix and one core fix. The core fix is an important one because a retry efficiency update is now causing some USB devices to get the wrong size on discovery (it upset their retry logic for READ_CAPACITY_16)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: iscsi: Fix redundant response for ISCSI_UEVENT_GET_HOST_STATS request scsi: core: Fix command pass through retry regression
2025-01-14drm/vmwgfx: Add new keep_resv BO paramIan Forbes
Adds a new BO param that keeps the reservation locked after creation. This removes the need to re-reserve the BO after creation which is a waste of cycles. This also fixes a bug in vmw_prime_import_sg_table where the imported reservation is unlocked twice. Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> Fixes: b32233acceff ("drm/vmwgfx: Fix prime import/export") Reviewed-by: Zack Rusin <zack.rusin@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250110185335.15301-1-ian.forbes@broadcom.com
2025-01-14drm/vmwgfx: Remove busy_placesIan Forbes
Unused since commit a78a8da51b36 ("drm/ttm: replace busy placement with flags v6") Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> Reviewed-by: Martin Krastev <martin.krastev@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250108201355.2521070-1-ian.forbes@broadcom.com
2025-01-14drm/vmwgfx: Unreserve BO on errorIan Forbes
Unlock BOs in reverse order. Add an acquire context so that lockdep doesn't complain. Fixes: d6667f0ddf46 ("drm/vmwgfx: Fix handling of dumb buffers") Signed-off-by: Ian Forbes <ian.forbes@broadcom.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241210195535.2074918-1-ian.forbes@broadcom.com
2025-01-14io_uring/rsrc: require cloned buffers to share accounting contextsJann Horn
When IORING_REGISTER_CLONE_BUFFERS is used to clone buffers from uring instance A to uring instance B, where A and B use different MMs for accounting, the accounting can go wrong: If uring instance A is closed before uring instance B, the pinned memory counters for uring instance B will be decremented, even though the pinned memory was originally accounted through uring instance A; so the MM of uring instance B can end up with negative locked memory. Cc: stable@vger.kernel.org Closes: https://lore.kernel.org/r/CAG48ez1zez4bdhmeGLEFxtbFADY4Czn3CV0u9d_TMcbvRA01bg@mail.gmail.com Fixes: 7cc2a6eadcd7 ("io_uring: add IORING_REGISTER_COPY_BUFFERS method") Signed-off-by: Jann Horn <jannh@google.com> Link: https://lore.kernel.org/r/20250114-uring-check-accounting-v1-1-42e4145aa743@google.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-14Merge tag 'sound-6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Hopefully the last PR for 6.13. This became bigger than wished due to the timing after holiday breaks. The only large LOC is the additional document for Cirrus codec which is nice for users (and absolutely safe). All the rest are small fixes in ASoC Rcar and codecs as well as HD-audio quirks (And no fix for USB guitar pedals seen yet :)" * tag 'sound-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek: Fix volume adjustment issue on Lenovo ThinkBook 16P Gen5 ALSA: hda/realtek: fixup ASUS H7606W ALSA: hda/realtek: fixup ASUS GA605W ALSA: hda/realtek: Add support for Ayaneo System using CS35L41 HDA ASoC: rsnd: check rsnd_adg_clk_enable() return value ASoC: cs42l43: Add codec force suspend/resume ops ALSA: doc: Add codecs/index.rst to top-level index ALSA: doc: cs35l56: Add information about Cirrus Logic CS35L54/56/57 ASoC: samsung: Add missing depends on I2C MAINTAINERS: add missing maintainers for Simple Audio Card ASoC: samsung: Add missing selects for MFD_WM8994 ASoC: codecs: es8316: Fix HW rate calculation for 48Mhz MCLK ASoC: wm8994: Add depends on MFD core ASoC: tas2781: Fix occasional calibration failture ASoC: codecs: ES8326: Adjust ANA_MICBIAS to reduce pop noise
2025-01-14gfs2: Truncate address space when flipping GFS2_DIF_JDATA flagAndreas Gruenbacher
Truncate an inode's address space when flipping the GFS2_DIF_JDATA flag: depending on that flag, the pages in the address space will either use buffer heads or iomap_folio_state structs, and we cannot mix the two. Reported-by: Kun Hu <huk23@m.fudan.edu.cn>, Jiaji Qin <jjtan24@m.fudan.edu.cn> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2025-01-14drm/display: hdmi: Do not read EDID on disconnected connectorsCristian Ciocaltea
The recently introduced hotplug event handler in the HDMI Connector framework attempts to unconditionally read the EDID data, leading to a bunch of non-harmful, yet quite annoying DDC/I2C related errors being reported. Ensure the operation is done only for connectors having the status connected or unknown. Additionally, perform an explicit reset of the connector information when dealing with a disconnected status. Fixes: ab716b74dc9d ("drm/display/hdmi: implement hotplug functions") Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250113-hdmi-conn-edid-read-fix-v2-1-d2a0438a44ab@collabora.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-01-14drm/tests: hdmi: Add connector disablement testLiu Ying
Atomic check should succeed when disabling a connector. Add a test case drm_test_check_disabling_connector() to make sure of this. Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Liu Ying <victor.liu@nxp.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250110084821.3239518-3-victor.liu@nxp.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-01-14drm/connector: hdmi: Do atomic check when necessaryLiu Ying
It's ok to pass atomic check successfully if an atomic commit tries to disable the display pipeline which the connector belongs to. That is, when the crtc or the best_encoder pointers in struct drm_connector_state are NULL, drm_atomic_helper_connector_hdmi_check() should return 0. Without the check against the NULL pointers, drm_default_rgb_quant_range() called by drm_atomic_helper_connector_hdmi_check() would dereference the NULL pointer to_match in drm_match_cea_mode(). Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Call trace: drm_default_rgb_quant_range+0x0/0x4c (P) drm_bridge_connector_atomic_check+0x20/0x2c drm_atomic_helper_check_modeset+0x488/0xc78 drm_atomic_helper_check+0x20/0xa4 drm_atomic_check_only+0x4b8/0x984 drm_atomic_commit+0x48/0xc4 drm_framebuffer_remove+0x44c/0x530 drm_mode_rmfb_work_fn+0x7c/0xa0 process_one_work+0x150/0x294 worker_thread+0x2dc/0x3dc kthread+0x130/0x204 ret_from_fork+0x10/0x20 Fixes: 8ec116ff21a9 ("drm/display: bridge_connector: provide atomic_check for HDMI bridges") Fixes: 84e541b1e58e ("drm/sun4i: use drm_atomic_helper_connector_hdmi_check()") Fixes: 65548c8ff0ab ("drm/rockchip: inno_hdmi: Switch to HDMI connector") Signed-off-by: Liu Ying <victor.liu@nxp.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250110084821.3239518-2-victor.liu@nxp.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-01-14Merge drm/drm-next into drm-misc-next-fixesMaxime Ripard
drm-next has the dmem cgroup patches we need to merge fixes for. Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-01-14blk-mq: Move more error handling into blk_mq_submit_bio()Bart Van Assche
The error handling code in blk_mq_get_new_requests() cannot be understood without knowing that this function is only called by blk_mq_submit_bio(). Hence move the code for handling blk_mq_get_new_requests() failures into blk_mq_submit_bio(). Cc: Damien Le Moal <dlemoal@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20241218212246.1073149-3-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-14block: Reorder the request allocation code in blk_mq_submit_bio()Bart Van Assche
Help the CPU branch predictor in case of a cache hit by handling the cache hit scenario first. Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20241218212246.1073149-2-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>