summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-08-27drm/amd/display: add sharpness support for windowed YUV420 videoSamson Tam
[Why] Previous only applied sharpness for fullscreen YUV420 video. [How] Remove fullscrene restriction and apply sharpness for windowed YUV420 video as well. Reviewed-by: Jun Lei <jun.lei@amd.com> Signed-off-by: Samson Tam <Samson.Tam@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amd/display: add improvements for text display and HDR DWM and MPOSamson Tam
[Why] Tune settings for improved text display. Handle differences between DWM and MPO in HDR path. [How] Update sharpener LBA table. Use HDR multiplier to calculate scalar matrix coefficients for HDR RGB MPO path. Update unit tests. Reviewed-by: Jun Lei <jun.lei@amd.com> Signed-off-by: Samson Tam <Samson.Tam@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amd/display: Add Replay Low Refresh Rate parameters in dc type.Dennis Chan
Why: To supported Low Refresh Rate panel for Replay Feature, Adding some parameters to record Low Refresh Rate information. Reviewed-by: Robin Chen <robin.chen@amd.com> Signed-off-by: Dennis Chan <dennis.chan@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amd/display: add back quality EASF and ISHARP and dc dependency changesSamson Tam
[Why] Addressed previous issues with quality changes and new issues due to rolling back quality changes. [How] This reverts commit f9e6759888866748f31b6b6c2142a481d587f51f, fixes merge conflicts, and fixed some formatting errors. Store current sharpness level for each pregen table to minimize calculating sharpness table every time. Disable dynamic ODM when sharpness is enabled. Reviewed-by: Jun Lei <jun.lei@amd.com> Signed-off-by: Samson Tam <Samson.Tam@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amd/display: Notify DMCUB of D0/D3 stateNicholas Kazlauskas
[Why] We want to avoid arming the HPD timer in firmware when preparing for S0i3 entry when DC is considered in D3. [How] Notify DMCUB of the power state transitions so it can decide to arm the HPD timer for idle in DCN35 only in D0. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Ovidiu Bunea <Ovidiu.Bunea@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amd/display: Fix Synaptics Cascaded Panamera DSC DeterminationFangzhi Zuo
Synaptics Cascaded Panamera topology needs to unconditionally acquire root aux for dsc decoding. Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amd/display: Retry Replay residencyChunTao Tso
[Why] Because sometime DMUB GPINT will time out, it will cause we return 0 as residency number. [How] Retry to avoid this happened. Reviewed-by: Robin Chen <robin.chen@amd.com> Signed-off-by: ChunTao Tso <ChunTao.Tso@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amd/display: Allocate DCN35 clock table transfer buffers in GARTNicholas Kazlauskas
[Why] Request from PMFW to use GART for clock table transfer tables as framebuffer is being deprecated on APU. [How] Switch over to GART via the allocation flag. Reviewed-by: Sung joon Kim <sungjoon.kim@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amd/display: do not set traslate_by_source for DCN401 cursorAurabindo Pillai
translate_by_source need not be set for DCN401 onwards since cursor cursor composition comes after scaler in the hardware pipeline. Hence offset calculation has been reworked, and this setting is not necessary to be enabled anymore. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amd/display: Resolve Coverity IssuesDaniel Sa
[WHY] Remove coverity issues that were originally ignored. [HOW] Ran coverity locally on driver, used output report to find existing coverity issues, resolved them Reviewed-by: Nicholas Choi <nicholas.choi@amd.com> Signed-off-by: Daniel Sa <Daniel.Sa@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amd/display: Fix MS/MP mismatches in dml21 for dcn401Dillon Varone
[WHY] Prefetch calculations did not guarantee that bandwidth required in mode support was less than mode programming which can cause failures. [HOW] Fix bandwidth calculations to assume fixed times for OTO schedule, and choose which schedule to use based on time to fetch pixel data. Reviewed-by: Jun Lei <jun.lei@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amd/display: Wait for all pending cleared before full updateAlvin Lee
[Description] Before every full update we must wait for all pending updates to be cleared - this is particularly important for minimal transitions because if we don't wait for pending cleared, it will be as if there was no minimal transition at all. In OTG we must read 3 different status registers for pending cleared, one specifically for OTG updates, one specifically for OPTC updates, and the last for surface related updates Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amd/display: guard write a 0 post_divider value to HWAhmed, Muhammad
[why] post_divider_value should not be 0. Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Ahmed, Muhammad <Ahmed.Ahmed@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amd/display: Don't skip clock updates in overclockingAlvin Lee
[Description] Skipping clock updates is not a hard requirement for overclocking and only an optimization. Remove the skip as this can cause issues for FAMS transitions during the overclock sequence. If FAMS is enabled we must disable UCLK switch on any full update (which requires update clocks to be called). Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amd: Introduce additional IPS debug flagsLeo Li
[Why] Idle power states (IPS) describe levels of power-gating within DCN. DM and DC is responsible for ensuring that we are out of IPS before any DCN programming happens. Any DCN programming while we're in IPS leads to undefined behavior (mostly hangs). Because IPS intersects with all display features, the ability to disable IPS by default while ironing out the known issues is desired. However, disabing it completely will cause important features such as s0ix entry to fail. Therefore, more granular IPS debug flags are desired. [How] Extend the dc debug mask bits to include the available list of IPS debug flags. All the flags should work as documented, with the exception of IPS_DISABLE_DYNAMIC. It requires dm changes which will be done in later changes. v2: enable docs and fix docstring format Signed-off-by: Leo Li <sunpeng.li@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amdgpu/smu13.0.7: print index for profilesAlex Deucher
Print the index for the profiles. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3543 Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amdgpu/swsmu: fix ordering for setting workload_maskAlex Deucher
No change in functionality for the current code, but we need to set the index properly before changing it if we ever use a non-0 index. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27drm/amdgpu: align pp_power_profile_mode with kernel docsAlex Deucher
The kernel doc says you need to select manual mode to adjust this, but the code only allows you to adjust it when manual mode is not selected. Remove the manual mode check. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-27Merge branch 'mptcp-close-subflow-when-receiving-tcp-fin-and-misc'Jakub Kicinski
Matthieu Baerts says: ==================== mptcp: close subflow when receiving TCP+FIN and misc. Here are different fixes: Patch 1 closes the subflow after having received a FIN, instead of leaving it half-closed until the end of the MPTCP connection. A fix for v5.12. Patch 2 validates the previous patch. Patch 3 is a fix for a recent fix to check both directions for the backup flag. It can follow the 'Fixes' commit and be backported up to v5.7. Patch 4 adds a missing \n at the end of pr_debug(), causing debug messages to be displayed with a delay, which confuses the debugger. A fix for v5.6. ==================== Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-0-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27mptcp: pr_debug: add missing \n at the endMatthieu Baerts (NGI0)
pr_debug() have been added in various places in MPTCP code to help developers to debug some situations. With the dynamic debug feature, it is easy to enable all or some of them, and asks users to reproduce issues with extra debug. Many of these pr_debug() don't end with a new line, while no 'pr_cont()' are used in MPTCP code. So the goal was not to display multiple debug messages on one line: they were then not missing the '\n' on purpose. Not having the new line at the end causes these messages to be printed with a delay, when something else needs to be printed. This issue is not visible when many messages need to be printed, but it is annoying and confusing when only specific messages are expected, e.g. # echo "func mptcp_pm_add_addr_echoed +fmp" \ > /sys/kernel/debug/dynamic_debug/control # ./mptcp_join.sh "signal address"; \ echo "$(awk '{print $1}' /proc/uptime) - end"; \ sleep 5s; \ echo "$(awk '{print $1}' /proc/uptime) - restart"; \ ./mptcp_join.sh "signal address" 013 signal address (...) 10.75 - end 15.76 - restart 013 signal address [ 10.367935] mptcp:mptcp_pm_add_addr_echoed: MPTCP: msk=(...) (...) => a delay of 5 seconds: printed with a 10.36 ts, but after 'restart' which was printed at the 15.76 ts. The 'Fixes' tag here below points to the first pr_debug() used without '\n' in net/mptcp. This patch could be split in many small ones, with different Fixes tag, but it doesn't seem worth it, because it is easy to re-generate this patch with this simple 'sed' command: git grep -l pr_debug -- net/mptcp | xargs sed -i "s/\(pr_debug(\".*[^n]\)\(\"[,)]\)/\1\\\n\2/g" So in case of conflicts, simply drop the modifications, and launch this command. Fixes: f870fa0b5768 ("mptcp: Add MPTCP socket stubs") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-4-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27mptcp: sched: check both backup in retransMatthieu Baerts (NGI0)
The 'mptcp_subflow_context' structure has two items related to the backup flags: - 'backup': the subflow has been marked as backup by the other peer - 'request_bkup': the backup flag has been set by the host Looking only at the 'backup' flag can make sense in some cases, but it is not the behaviour of the default packet scheduler when selecting paths. As explained in the commit b6a66e521a20 ("mptcp: sched: check both directions for backup"), the packet scheduler should look at both flags, because that was the behaviour from the beginning: the 'backup' flag was set by accident instead of the 'request_bkup' one. Now that the latter has been fixed, get_retrans() needs to be adapted as well. Fixes: b6a66e521a20 ("mptcp: sched: check both directions for backup") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-3-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27selftests: mptcp: join: cannot rm sf if closedMatthieu Baerts (NGI0)
Thanks to the previous commit, the MPTCP subflows are now closed on both directions even when only the MPTCP path-manager of one peer asks for their closure. In the two tests modified here -- "userspace pm add & remove address" and "userspace pm create destroy subflow" -- one peer is controlled by the userspace PM, and the other one by the in-kernel PM. When the userspace PM sends a RM_ADDR notification, the in-kernel PM will automatically react by closing all subflows using this address. Now, thanks to the previous commit, the subflows are properly closed on both directions, the userspace PM can then no longer closes the same subflows if they are already closed. Before, it was OK to do that, because the subflows were still half-opened, still OK to send a RM_ADDR. In other words, thanks to the previous commit closing the subflows, an error will be returned to the userspace if it tries to close a subflow that has already been closed. So no need to run this command, which mean that the linked counters will then not be incremented. These tests are then no longer sending both a RM_ADDR, then closing the linked subflow just after. The test with the userspace PM on the server side is now removing one subflow linked to one address, then sending a RM_ADDR for another address. The test with the userspace PM on the client side is now only removing the subflow that was previously created. Fixes: 4369c198e599 ("selftests: mptcp: test userspace pm out of transfer") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-2-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27mptcp: close subflow when receiving TCP+FINMatthieu Baerts (NGI0)
When a peer decides to close one subflow in the middle of a connection having multiple subflows, the receiver of the first FIN should accept that, and close the subflow on its side as well. If not, the subflow will stay half closed, and would even continue to be used until the end of the MPTCP connection or a reset from the network. The issue has not been seen before, probably because the in-kernel path-manager always sends a RM_ADDR before closing the subflow. Upon the reception of this RM_ADDR, the other peer will initiate the closure on its side as well. On the other hand, if the RM_ADDR is lost, or if the path-manager of the other peer only closes the subflow without sending a RM_ADDR, the subflow would switch to TCP_CLOSE_WAIT, but that's it, leaving the subflow half-closed. So now, when the subflow switches to the TCP_CLOSE_WAIT state, and if the MPTCP connection has not been closed before with a DATA_FIN, the kernel owning the subflow schedules its worker to initiate the closure on its side as well. This issue can be easily reproduced with packetdrill, as visible in [1], by creating an additional subflow, injecting a FIN+ACK before sending the DATA_FIN, and expecting a FIN+ACK in return. Fixes: 40947e13997a ("mptcp: schedule worker when subflow is closed") Cc: stable@vger.kernel.org Link: https://github.com/multipath-tcp/packetdrill/pull/154 [1] Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-1-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27tcp: fix forever orphan socket caused by tcp_abortXueming Feng
We have some problem closing zero-window fin-wait-1 tcp sockets in our environment. This patch come from the investigation. Previously tcp_abort only sends out reset and calls tcp_done when the socket is not SOCK_DEAD, aka orphan. For orphan socket, it will only purging the write queue, but not close the socket and left it to the timer. While purging the write queue, tp->packets_out and sk->sk_write_queue is cleared along the way. However tcp_retransmit_timer have early return based on !tp->packets_out and tcp_probe_timer have early return based on !sk->sk_write_queue. This caused ICSK_TIME_RETRANS and ICSK_TIME_PROBE0 not being resched and socket not being killed by the timers, converting a zero-windowed orphan into a forever orphan. This patch removes the SOCK_DEAD check in tcp_abort, making it send reset to peer and close the socket accordingly. Preventing the timer-less orphan from happening. According to Lorenzo's email in the v1 thread, the check was there to prevent force-closing the same socket twice. That situation is handled by testing for TCP_CLOSE inside lock, and returning -ENOENT if it is already closed. The -ENOENT code comes from the associate patch Lorenzo made for iproute2-ss; link attached below, which also conform to RFC 9293. At the end of the patch, tcp_write_queue_purge(sk) is removed because it was already called in tcp_done_with_error(). p.s. This is the same patch with v2. Resent due to mis-labeled "changes requested" on patchwork.kernel.org. Link: https://patchwork.ozlabs.org/project/netdev/patch/1450773094-7978-3-git-send-email-lorenzo@google.com/ Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.") Signed-off-by: Xueming Feng <kuro@kuroa.me> Tested-by: Lorenzo Colitti <lorenzo@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20240826102327.1461482-1-kuro@kuroa.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27gtp: fix a potential NULL pointer dereferenceCong Wang
When sockfd_lookup() fails, gtp_encap_enable_socket() returns a NULL pointer, but its callers only check for error pointers thus miss the NULL pointer case. Fix it by returning an error pointer with the error code carried from sockfd_lookup(). (I found this bug during code inspection.) Fixes: 1e3a3abd8b28 ("gtp: make GTP sockets in gtp_newlink optional") Cc: Andreas Schultz <aschultz@tpip.net> Cc: Harald Welte <laforge@gnumonks.org> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org> Link: https://patch.msgid.link/20240825191638.146748-1-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27Merge drm/drm-next into drm-intel-nextRodrigo Vivi
Need to take some Xe bo definition in here before we can add the BMG display 64k aligned size restrictions. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-27rust: allow `stable_features` lintMiguel Ojeda
Support for several Rust compiler versions started in commit 63b27f4a0074 ("rust: start supporting several compiler versions"). Since we currently need to use a number of unstable features in the kernel, it is a matter of time until one gets stabilized and the `stable_features` lint warns. For instance, the `new_uninit` feature may become stable soon, which would give us multiple warnings like the following: warning: the feature `new_uninit` has been stable since 1.82.0-dev and no longer requires an attribute to enable --> rust/kernel/lib.rs:17:12 | 17 | #![feature(new_uninit)] | ^^^^^^^^^^ | = note: `#[warn(stable_features)]` on by default Thus allow the `stable_features` lint to avoid such warnings. This is the simplest approach -- we do not have that many cases (and the goal is to stop using unstable features anyway) and cleanups can be easily done when we decide to update the minimum version. An alternative would be to conditionally enable them based on the compiler version (with the upcoming `RUSTC_VERSION` or maybe with the unstable `cfg(version(...))`, but that one apparently will not work for the nightly case). However, doing so is more complex and may not work well for different nightlies of the same version, unless we do not care about older nightlies. Another alternative is using explicit tests of the feature calling `rustc`, but that is also more complex and slower. Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/20240827100403.376389-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-08-27docs: rust: remove unintended blockquote in Quick StartJon Mulder
Remove indentation within the "Hacking" section of the Rust Quick Start guide, i.e. remove a `<blockquote>` HTML element from the rendered documentation. Reported-by: Miguel Ojeda <ojeda@kernel.org> Closes: https://github.com/Rust-for-Linux/linux/issues/1103 Fixes: d07479b211b7 ("docs: add Rust documentation") Signed-off-by: Jon Mulder <jon.e.mulder@gmail.com> Link: https://lore.kernel.org/r/20240826-pr-docs-rust-remove-quickstart-blockquote-v1-1-c51317d8d71a@gmail.com [ Added Fixes tag, reworded slightly and matched title to a previous, similar commit. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-08-27Merge branch 'fixes-for-ipsec-over-bonding'Jakub Kicinski
Jianbo Liu says: ==================== Fixes for IPsec over bonding This patchset provides bug fixes for IPsec over bonding driver. It adds the missing xdo_dev_state_free API, and fixes "scheduling while atomic" by using mutex lock instead. Series generated against: commit c07ff8592d57 ("netem: fix return value if duplicate enqueue fails") ==================== Link: https://patch.msgid.link/20240823031056.110999-1-jianbol@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27bonding: change ipsec_lock from spin lock to mutexJianbo Liu
In the cited commit, bond->ipsec_lock is added to protect ipsec_list, hence xdo_dev_state_add and xdo_dev_state_delete are called inside this lock. As ipsec_lock is a spin lock and such xfrmdev ops may sleep, "scheduling while atomic" will be triggered when changing bond's active slave. [ 101.055189] BUG: scheduling while atomic: bash/902/0x00000200 [ 101.055726] Modules linked in: [ 101.058211] CPU: 3 PID: 902 Comm: bash Not tainted 6.9.0-rc4+ #1 [ 101.058760] Hardware name: [ 101.059434] Call Trace: [ 101.059436] <TASK> [ 101.060873] dump_stack_lvl+0x51/0x60 [ 101.061275] __schedule_bug+0x4e/0x60 [ 101.061682] __schedule+0x612/0x7c0 [ 101.062078] ? __mod_timer+0x25c/0x370 [ 101.062486] schedule+0x25/0xd0 [ 101.062845] schedule_timeout+0x77/0xf0 [ 101.063265] ? asm_common_interrupt+0x22/0x40 [ 101.063724] ? __bpf_trace_itimer_state+0x10/0x10 [ 101.064215] __wait_for_common+0x87/0x190 [ 101.064648] ? usleep_range_state+0x90/0x90 [ 101.065091] cmd_exec+0x437/0xb20 [mlx5_core] [ 101.065569] mlx5_cmd_do+0x1e/0x40 [mlx5_core] [ 101.066051] mlx5_cmd_exec+0x18/0x30 [mlx5_core] [ 101.066552] mlx5_crypto_create_dek_key+0xea/0x120 [mlx5_core] [ 101.067163] ? bonding_sysfs_store_option+0x4d/0x80 [bonding] [ 101.067738] ? kmalloc_trace+0x4d/0x350 [ 101.068156] mlx5_ipsec_create_sa_ctx+0x33/0x100 [mlx5_core] [ 101.068747] mlx5e_xfrm_add_state+0x47b/0xaa0 [mlx5_core] [ 101.069312] bond_change_active_slave+0x392/0x900 [bonding] [ 101.069868] bond_option_active_slave_set+0x1c2/0x240 [bonding] [ 101.070454] __bond_opt_set+0xa6/0x430 [bonding] [ 101.070935] __bond_opt_set_notify+0x2f/0x90 [bonding] [ 101.071453] bond_opt_tryset_rtnl+0x72/0xb0 [bonding] [ 101.071965] bonding_sysfs_store_option+0x4d/0x80 [bonding] [ 101.072567] kernfs_fop_write_iter+0x10c/0x1a0 [ 101.073033] vfs_write+0x2d8/0x400 [ 101.073416] ? alloc_fd+0x48/0x180 [ 101.073798] ksys_write+0x5f/0xe0 [ 101.074175] do_syscall_64+0x52/0x110 [ 101.074576] entry_SYSCALL_64_after_hwframe+0x4b/0x53 As bond_ipsec_add_sa_all and bond_ipsec_del_sa_all are only called from bond_change_active_slave, which requires holding the RTNL lock. And bond_ipsec_add_sa and bond_ipsec_del_sa are xfrm state xdo_dev_state_add and xdo_dev_state_delete APIs, which are in user context. So ipsec_lock doesn't have to be spin lock, change it to mutex, and thus the above issue can be resolved. Fixes: 9a5605505d9c ("bonding: Add struct bond_ipesc to manage SA") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Link: https://patch.msgid.link/20240823031056.110999-4-jianbol@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27bonding: extract the use of real_device into local variableJianbo Liu
Add a local variable for slave->dev, to prepare for the lock change in the next patch. There is no functionality change. Fixes: 9a5605505d9c ("bonding: Add struct bond_ipesc to manage SA") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Link: https://patch.msgid.link/20240823031056.110999-3-jianbol@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27bonding: implement xdo_dev_state_free and call it after deletionJianbo Liu
Add this implementation for bonding, so hardware resources can be freed from the active slave after xfrm state is deleted. The netdev used to invoke xdo_dev_state_free callback, is saved in the xfrm state (xs->xso.real_dev), which is also the bond's active slave. To prevent it from being freed, acquire netdev reference before leaving RCU read-side critical section, and release it after callback is done. And call it when deleting all SAs from old active real interface while switching current active slave. Fixes: 9a5605505d9c ("bonding: Add struct bond_ipesc to manage SA") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Link: https://patch.msgid.link/20240823031056.110999-2-jianbol@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27selftests: forwarding: local_termination: Down ports on cleanupPetr Machata
This test neglects to put ports down on cleanup. Fix it. Fixes: 90b9566aa5cd ("selftests: forwarding: add a test for local_termination.sh") Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/bf9b79f45de378f88344d44550f0a5052b386199.1724692132.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27selftests: forwarding: no_forwarding: Down ports on cleanupPetr Machata
This test neglects to put ports down on cleanup. Fix it. Fixes: 476a4f05d9b8 ("selftests: forwarding: add a no_forwarding.sh test") Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/0baf91dc24b95ae0cadfdf5db05b74888e6a228a.1724430120.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27drm/xe: Support 'nomodeset' kernel command-line optionThomas Zimmermann
Setting 'nomodeset' on the kernel command line disables all graphics drivers with modesetting capabilities, leaving only firmware drivers, such as simpledrm or efifb. Most DRM drivers automatically support 'nomodeset' via DRM's module helper macros. In xe, which uses regular module_init(), manually call drm_firmware_drivers_only() to test for 'nomodeset'. Do not register the driver if set. v2: - use xe's init table (Lucas) - do NULL test for init/exit functions Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240827121003.97429-1-tzimmermann@suse.de Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-08-27Revert "drm/panel-edp: Add SDC ATNA45AF01"Stephan Gerhold
This reverts commit 8ebb1fc2e69ab8b89a425e402c7bd85e053b7b01. The panel should be handled through the samsung-atna33xc20 driver for correct power up timings. Otherwise the backlight does not work correctly. We have existing users of this panel through the generic "edp-panel" compatible (e.g. the Qualcomm X1E80100 CRD), but the screen works only partially in that configuration: It works after boot but once the screen gets disabled it does not turn on again until after reboot. It behaves the same way with the default "conservative" timings, so we might as well drop the configuration from the panel-edp driver. That way, users with old DTBs will get a warning and can move to the new driver. Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Tested-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240715-x1e80100-crd-backlight-v2-2-31b7f2f658a3@linaro.org
2024-08-27nvme-pci: Add sleep quirk for Samsung 990 EvoGeorg Gottleuber
On some TUXEDO platforms, a Samsung 990 Evo NVMe leads to a high power consumption in s2idle sleep (2-3 watts). This patch applies 'Force No Simple Suspend' quirk to achieve a sleep with a lower power consumption, typically around 0.5 watts. Signed-off-by: Georg Gottleuber <ggo@tuxedocomputers.com> Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Cc: <stable@vger.kernel.org> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-08-27Merge tag 'amd-pstate-v6.11-2024-08-26' of ↵Rafael J. Wysocki
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux Merge amd-pstate driver fixes for 6.11-rc6 from Mario Limonciello: "amd-pstate fixes for 6.11-rc - Fix to unit test coverage - Fix bug with enabling CPPC on hetero designs - Fix uninitialized variable" * tag 'amd-pstate-v6.11-2024-08-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: cpufreq/amd-pstate-ut: Don't check for highest perf matching on prefcore cpufreq/amd-pstate: Use topology_logical_package_id() instead of logical_die_id() cpufreq: amd-pstate: Fix uninitialized variable in amd_pstate_cpu_boost_update()
2024-08-28Merge tag 'livepatching-for-6.11-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching Pull livepatching fix from Petr Mladek: "Selftest regression fix" * tag 'livepatching-for-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: selftests/livepatch: wait for atomic replace to occur
2024-08-28Merge tag 'pinctrl-v6.11-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - Fix the hwirq map and pin offsets in the Qualcomm X1E80100 driver - Fix the pin range handling in the AT91 driver so it works again - Fix a NULL-dereference risk in pinctrl single - Fix a serious biasing bug in the Mediatek driver - Fix the level trigged IRQ in the StarFive JH7110 - Fix the iomux width in the Rockchip GPIO2-B pin handling * tag 'pinctrl-v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: rockchip: correct RK3328 iomux width flag for GPIO2-B pins pinctrl: starfive: jh7110: Correct the level trigger configuration of iev register pinctrl: qcom: x1e80100: Fix special pin offsets pinctrl: mediatek: common-v2: Fix broken bias-disable for PULL_PU_PD_RSEL_TYPE pinctrl: single: fix potential NULL dereference in pcs_get_function() pinctrl: at91: make it work with current gpiolib pinctrl: qcom: x1e80100: Update PDC hwirq map
2024-08-28Merge tag 'sound-6.11-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "It became a bit larger collection of fixes than wished at this time, but all changes are small and mostly device-specific fixes that should be fairly safe to apply. Majority of fixes are about ASoC for AMD SOF, Cirrus codecs, lpass, etc, in addition to the usual HD-audio quirks / fixes" * tag 'sound-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (22 commits) ALSA: hda: hda_component: Fix mutex crash if nothing ever binds ALSA: hda/realtek: support HP Pavilion Aero 13-bg0xxx Mute LED ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book3 Ultra ASoC: cs-amp-lib: Ignore empty UEFI calibration entries ASoC: cs-amp-lib-test: Force test calibration blob entries to be valid ALSA: hda/realtek - FIxed ALC285 headphone no sound ALSA: hda/realtek - Fixed ALC256 headphone no sound ASoC: allow module autoloading for table board_ids ASoC: allow module autoloading for table db1200_pids ALSA: hda: cs35l56: Don't use the device index as a calibration index ALSA: seq: Skip event type filtering for UMP events ALSA: hda/realtek: Enable mute/micmute LEDs on HP Laptop 14-ey0xxx ASoC: SOF: amd: Fix for acp init sequence ASoC: amd: acp: fix module autoloading ASoC: mediatek: mt8188: Mark AFE_DAC_CON0 register as volatile ASoC: codecs: wcd937x: Fix missing de-assert of reset GPIO ASoC: SOF: mediatek: Add missing board compatible ASoC: MAINTAINERS: Drop Banajit Goswami from Qualcomm sound drivers ASoC: SOF: amd: Fix for incorrect acp error register offsets ASoC: SOF: amd: move iram-dram fence register programming sequence ...
2024-08-27tpm: ibmvtpm: Call tpm2_sessions_init() to initialize session supportStefan Berger
Commit d2add27cf2b8 ("tpm: Add NULL primary creation") introduced CONFIG_TCG_TPM2_HMAC. When this option is enabled on ppc64 then the following message appears in the kernel log due to a missing call to tpm2_sessions_init(). [ 2.654549] tpm tpm0: auth session is not active Add the missing call to tpm2_session_init() to the ibmvtpm driver to resolve this issue. Cc: stable@vger.kernel.org # v6.10+ Fixes: d2add27cf2b8 ("tpm: Add NULL primary creation") Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-08-27clk: qcom: gcc-x1e80100: Don't use parking clk_ops for QUPsBryan O'Donoghue
Per Stephen Boyd's explanation in the link below, QUP RCG clocks do not need to be parked when switching frequency. A side-effect in parking to a lower frequency can be a momentary invalid clock driven on an in-use serial peripheral. This can cause "junk" to spewed out of a UART as a low-impact example. On the x1e80100-crd this serial port junk can be observed on linux-next. Apply a similar fix to the x1e80100 Global Clock controller to remediate. Link: https://lore.kernel.org/all/20240819233628.2074654-3-swboyd@chromium.org/ Fixes: 161b7c401f4b ("clk: qcom: Add Global Clock controller (GCC) driver for X1E80100") Fixes: 929c75d57566 ("clk: qcom: gcc-sm8550: Mark RCGs shared where applicable") Suggested-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Link: https://lore.kernel.org/r/20240823-x1e80100-clk-fix-v1-1-0b1b4f5a96e8@linaro.org Reviewed-by: Konrad Dybcio <konradybcio@kernel.org> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2024-08-27Merge tag 'qcom-clk-fixes-for-6.11' of ↵Stephen Boyd
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into clk-fixes Pull Qualcomm clk driver fixes from Bjorn Andersson: This corrects several issues with the Alpha PLL clock driver. It updates IPQ9574 GCC driver to correctly use the EVO PLL registers for GPLL clocks. X1E USB GDSC flags are corrected to leave these in retention as the controllers are suspended. * tag 'qcom-clk-fixes-for-6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: clk: qcom: ipq9574: Update the alpha PLL type for GPLLs clk: qcom: gcc-x1e80100: Fix USB 0 and 1 PHY GDSC pwrsts flags clk: qcom: clk-alpha-pll: Update set_rate for Zonda PLL clk: qcom: clk-alpha-pll: Fix zonda set_rate failure when PLL is disabled clk: qcom: clk-alpha-pll: Fix the trion pll postdiv set rate API clk: qcom: clk-alpha-pll: Fix the pll post div mask
2024-08-27netfilter: nf_tables_ipv6: consider network offset in netdev/egress validationPablo Neira Ayuso
From netdev/egress, skb->len can include the ethernet header, therefore, subtract network offset from skb->len when validating IPv6 packet length. Fixes: 42df6e1d221d ("netfilter: Introduce egress hook") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-08-27net_sched: sch_fq: fix incorrect behavior for small weightsEric Dumazet
fq_dequeue() has a complex logic to find packets in one of the 3 bands. As Neal found out, it is possible that one band has a deficit smaller than its weight. fq_dequeue() can return NULL while some packets are elligible for immediate transmit. In this case, more than one iteration is needed to refill pband->credit. With default parameters (weights 589824 196608 65536) bug can trigger if large BIG TCP packets are sent to the lowest priority band. Bisected-by: John Sperbeck <jsperbeck@google.com> Diagnosed-by: Neal Cardwell <ncardwell@google.com> Fixes: 29f834aa326e ("net_sched: sch_fq: add 3 bands and WRR scheduling") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20240824181901.953776-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27drm/i915: Do not attempt to load the GSC multiple timesDaniele Ceraolo Spurio
If the GSC FW fails to load the GSC HW hangs permanently; the only ways to recover it are FLR or D3cold entry, with the former only being supported on driver unload and the latter only on DGFX, for which we don't need to load the GSC. Therefore, if GSC fails to load there is no need to try again because the HW is stuck in the error state and the submission to load the FW would just hang the GSCCS. Note that, due to wa_14015076503, on MTL the GuC escalates all GSCCS hangs to full GT resets, which would trigger a new attempt to load the GSC FW in the post-reset HW re-init; this issue is also fixed by not attempting to load the GSC FW after an error. Fixes: 15bd4a67e914 ("drm/i915/gsc: GSC firmware loading") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.3+ Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240820215952.2290807-1-daniele.ceraolospurio@intel.com
2024-08-27nvme-pci: allocate tagset on reset if necessaryKeith Busch
If a drive is unable to create IO queues on the initial probe, a subsequent reset will need to allocate the tagset if IO queue creation is successful. Without this, blk_mq_update_nr_hw_queues will crash on a bad pointer due to the invalid tagset. Fixes: eac3ef262941f62 ("nvme-pci: split the initial probe from the rest path") Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-08-27btrfs: fix uninitialized return value from btrfs_reclaim_sweep()Filipe Manana
The return variable 'ret' at btrfs_reclaim_sweep() is never assigned if none of the space infos is reclaimable (for example if periodic reclaim is disabled, which is the default), so we return an undefined value. This can be fixed my making btrfs_reclaim_sweep() not return any value as well as do_reclaim_sweep() because: 1) do_reclaim_sweep() always returns 0, so we can make it return void; 2) The only caller of btrfs_reclaim_sweep() (btrfs_reclaim_bgs()) doesn't care about its return value, and in its context there's nothing to do about any errors anyway. Therefore remove the return value from btrfs_reclaim_sweep() and do_reclaim_sweep(). Fixes: e4ca3932ae90 ("btrfs: periodic block_group reclaim") Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-08-27drm/xe: Invalidate media_gt TLBsMatthew Brost
Testing on LNL has shown media TLBs need to be invalidated via the GuC, update xe_vm_invalidate_vma appropriately. v2: Fix 2 tile case v3: Include missing local change Fixes: 3330361543fc ("drm/xe/lnl: Add LNL platform definition") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240820160129.986889-1-matthew.brost@intel.com (cherry picked from commit 77cc3f6c58b1b28cee73904946c46a1415187d04) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>