summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-10-04drm/amd: Fix logic error in sienna_cichlid_update_pcie_parameters()Mario Limonciello
While aligning SMU11 with SMU13 implementation an assumption was made that `dpm_context->dpm_tables.pcie_table` was populated in dpm table initialization like in SMU13 but it isn't. So restore some of the original logic and instead just check for amdgpu_device_pcie_dynamic_switching_supported() to decide whether to hardcode values; erring on the side of performance. Cc: stable@vger.kernel.org # 6.1+ Reported-and-tested-by: Umio Yasuno <coelacanth_dream@protonmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1447#note_2101382 Fixes: e701156ccc6c ("drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-04drm/amdgpu: Fix a memory leakLuben Tuikov
Fix a memory leak in amdgpu_fru_get_product_info(). Cc: Alex Deucher <Alexander.Deucher@amd.com> Reported-by: Yang Wang <kevinyang.wang@amd.com> Fixes: 0dbf2c562625 ("drm/amdgpu: Interpret IPMI data for product information (v2)") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-04drm/amd/pm: add unique_id for gc 11.0.3Kenneth Feng
add unique_id for gc 11.0.3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-04ksmbd: fix race condition from parallel smb2 lock requestsNamjae Jeon
There is a race condition issue between parallel smb2 lock request. Time + Thread A | Thread A smb2_lock | smb2_lock | insert smb_lock to lock_list | spin_unlock(&work->conn->llist_lock) | | | spin_lock(&conn->llist_lock); | kfree(cmp_lock); | // UAF! | list_add(&smb_lock->llist, &rollback_list) + This patch swaps the line for adding the smb lock to the rollback list and adding the lock list of connection to fix the race issue. Reported-by: luosili <rootlab@huawei.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-10-04ksmbd: fix race condition from parallel smb2 logoff requestsNamjae Jeon
If parallel smb2 logoff requests come in before closing door, running request count becomes more than 1 even though connection status is set to KSMBD_SESS_NEED_RECONNECT. It can't get condition true, and sleep forever. This patch fix race condition problem by returning error if connection status was already set to KSMBD_SESS_NEED_RECONNECT. Reported-by: luosili <rootlab@huawei.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-10-04ksmbd: fix uaf in smb20_oplock_break_ackluosili
drop reference after use opinfo. Signed-off-by: luosili <rootlab@huawei.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-10-04ksmbd: fix race condition with fpNamjae Jeon
fp can used in each command. If smb2_close command is coming at the same time, UAF issue can happen by race condition. Time + Thread A | Thread B1 B2 .... B5 smb2_open | smb2_close | __open_id | insert fp to file_table | | | atomic_dec_and_test(&fp->refcount) | if fp->refcount == 0, free fp by kfree. // UAF! | use fp | + This patch add f_state not to use freed fp is used and not to free fp in use. Reported-by: luosili <rootlab@huawei.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-10-04ksmbd: fix race condition between session lookup and expireNamjae Jeon
Thread A + Thread B ksmbd_session_lookup | smb2_sess_setup sess = xa_load | | | xa_erase(&conn->sessions, sess->id); | | ksmbd_session_destroy(sess) --> kfree(sess) | // UAF! | sess->last_active = jiffies | + This patch add rwsem to fix race condition between ksmbd_session_lookup and ksmbd_expire_session. Reported-by: luosili <rootlab@huawei.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-10-04Merge tag 'rtla-v6.6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/bristot/linux Pull rtla fixes from Daniel Bristot de Oliveira: "rtla (Real-Time Linux Analysis) tool fixes. Timerlat auto-analysis: - Timerlat is reporting thread interference time without thread noise events occurrence. It was caused because the thread interference variable was not reset after the analysis of a timerlat activation that did not hit the threshold. - The IRQ handler delay is estimated from the delta of the IRQ latency reported by timerlat, and the timestamp from IRQ handler start event. If the delta is near-zero, the drift from the external clock and the trace event and/or the overhead can cause the value to be negative. If the value is negative, print a zero-delay. - IRQ handlers happening after the timerlat thread event but before the stop tracing were being reported as IRQ that happened before the *current* IRQ occurrence. Ignore Previous IRQ noise in this condition because they are valid only for the *next* timerlat activation. Timerlat user-space: - Timerlat is stopping all user-space thread if a CPU becomes offline. Do not stop the entire tool if a CPU is/become offline, but only the thread of the unavailable CPU. Stop the tool only, if all threads leave because the CPUs become/are offline. man-pages: - Fix command line example in timerlat hist man page" * tag 'rtla-v6.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bristot/linux: rtla: fix a example in rtla-timerlat-hist.rst rtla/timerlat: Do not stop user-space if a cpu is offline rtla/timerlat_aa: Fix previous IRQ delay for IRQs that happens after thread sample rtla/timerlat_aa: Fix negative IRQ delay rtla/timerlat_aa: Zero thread sum after every sample analysis
2023-10-04Merge branch 'ynl-makefile-cleanup'Jakub Kicinski
Jakub Kicinski says: ==================== ynl Makefile cleanup While catching up on recent changes I noticed unexpected changes to Makefiles in YNL. Indeed they were not working as intended but the fixes put in place were not what I had in mind :) ==================== Link: https://lore.kernel.org/r/20231003153416.2479808-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04tools: ynl: use uAPI include magic for samplesJakub Kicinski
Makefile.deps provides direct includes in CFLAGS_$(obj). We just need to rewrite the rules to make use of the extra flags, no need to hard-include all of tools/include/uapi. Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20231003153416.2479808-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04tools: ynl: don't regen on every makeJakub Kicinski
As far as I can tell the normal Makefile dependency tracking works, generated files get re-generated if the YAML was updated. Let make do its job, don't force the re-generation. make hardclean can be used to force regeneration. Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20231003153416.2479808-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04ynl: netdev: drop unnecessary enum-as-flagsJakub Kicinski
enum-as-flags can be used when enum declares bit positions but we want to carry bitmask in an attribute. If the definition is already provided as flags there's no need to indicate the flag-iness of the attribute. Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20231003153416.2479808-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04netlink: annotate data-races around sk->sk_errEric Dumazet
syzbot caught another data-race in netlink when setting sk->sk_err. Annotate all of them for good measure. BUG: KCSAN: data-race in netlink_recvmsg / netlink_recvmsg write to 0xffff8881613bb220 of 4 bytes by task 28147 on cpu 0: netlink_recvmsg+0x448/0x780 net/netlink/af_netlink.c:1994 sock_recvmsg_nosec net/socket.c:1027 [inline] sock_recvmsg net/socket.c:1049 [inline] __sys_recvfrom+0x1f4/0x2e0 net/socket.c:2229 __do_sys_recvfrom net/socket.c:2247 [inline] __se_sys_recvfrom net/socket.c:2243 [inline] __x64_sys_recvfrom+0x78/0x90 net/socket.c:2243 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd write to 0xffff8881613bb220 of 4 bytes by task 28146 on cpu 1: netlink_recvmsg+0x448/0x780 net/netlink/af_netlink.c:1994 sock_recvmsg_nosec net/socket.c:1027 [inline] sock_recvmsg net/socket.c:1049 [inline] __sys_recvfrom+0x1f4/0x2e0 net/socket.c:2229 __do_sys_recvfrom net/socket.c:2247 [inline] __se_sys_recvfrom net/socket.c:2243 [inline] __x64_sys_recvfrom+0x78/0x90 net/socket.c:2243 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd value changed: 0x00000000 -> 0x00000016 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 28146 Comm: syz-executor.0 Not tainted 6.6.0-rc3-syzkaller-00055-g9ed22ae6be81 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/06/2023 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231003183455.3410550-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04net: skb_queue_purge_reason() optimizationsEric Dumazet
1) Exit early if the list is empty. 2) splice the list into a local list, so that we block hard irqs only once. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231003181920.3280453-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04sctp: update hb timer immediately after users change hb_intervalXin Long
Currently, when hb_interval is changed by users, it won't take effect until the next expiry of hb timer. As the default value is 30s, users have to wait up to 30s to wait its hb_interval update to work. This becomes pretty bad in containers where a much smaller value is usually set on hb_interval. This patch improves it by resetting the hb timer immediately once the value of hb_interval is updated by users. Note that we don't address the already existing 'problem' when sending a heartbeat 'on demand' if one hb has just been sent(from the timer) mentioned in: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg590224.html Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Link: https://lore.kernel.org/r/75465785f8ee5df2fb3acdca9b8fafdc18984098.1696172660.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04sctp: update transport state when processing a dupcook packetXin Long
During the 4-way handshake, the transport's state is set to ACTIVE in sctp_process_init() when processing INIT_ACK chunk on client or COOKIE_ECHO chunk on server. In the collision scenario below: 192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408] 192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885] 192.168.1.2 > 192.168.1.1: sctp (1) [INIT ACK] [init tag: 3922216408] 192.168.1.1 > 192.168.1.2: sctp (1) [COOKIE ECHO] 192.168.1.2 > 192.168.1.1: sctp (1) [COOKIE ACK] 192.168.1.1 > 192.168.1.2: sctp (1) [INIT ACK] [init tag: 3914796021] when processing COOKIE_ECHO on 192.168.1.2, as it's in COOKIE_WAIT state, sctp_sf_do_dupcook_b() is called by sctp_sf_do_5_2_4_dupcook() where it creates a new association and sets its transport to ACTIVE then updates to the old association in sctp_assoc_update(). However, in sctp_assoc_update(), it will skip the transport update if it finds a transport with the same ipaddr already existing in the old asoc, and this causes the old asoc's transport state not to move to ACTIVE after the handshake. This means if DATA retransmission happens at this moment, it won't be able to enter PF state because of the check 'transport->state == SCTP_ACTIVE' in sctp_do_8_2_transport_strike(). This patch fixes it by updating the transport in sctp_assoc_update() with sctp_assoc_add_peer() where it updates the transport state if there is already a transport with the same ipaddr exists in the old asoc. Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Link: https://lore.kernel.org/r/fd17356abe49713ded425250cc1ae51e9f5846c6.1696172325.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04Merge branch '40GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-10-03 (i40e, iavf) This series contains updates to i40e and iavf drivers. Yajun Deng aligns reporting of buffer exhaustion statistics to follow documentation for i40e. Jake removes undesired 'inline' from functions in iavf. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: iavf: remove "inline" functions from iavf_txrx.c i40e: Add rx_missed_errors for buffer exhaustion ==================== Link: https://lore.kernel.org/r/20231003223610.2004976-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04Merge branch ↵Jakub Kicinski
'fix-a-couple-recent-instances-of-wincompatible-function-pointer-types-strict-from-mode_get-implementations' Nathan Chancellor says: ==================== Fix a couple recent instances of -Wincompatible-function-pointer-types-strict from ->mode_get() implementations This series fixes a couple of instances of -Wincompatible-function-pointer-types-strict that were introduced by a recent series that added a new type of ops, struct dpll_device_ops, along with implementations of the callback ->mode_get() that had a mismatched mode type. This warning is not currently enabled for any build but I am planning on submitting a patch to add it to W=1 to prevent new instances of the warning from popping up while we try and fix the existing instances in other drivers. This series is based on current net-next but if they need to go into individual maintainer trees, please feel free to take the patches individually. ==================== Link: https://lore.kernel.org/r/20231002-net-wifpts-dpll_mode_get-v1-0-a356a16413cf@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04mlx5: Fix type of mode parameter in mlx5_dpll_device_mode_get()Nathan Chancellor
When building with -Wincompatible-function-pointer-types-strict, a warning designed to catch potential kCFI failures at build time rather than run time due to incorrect function pointer types, there is a warning due to a mismatch between the type of the mode parameter in mlx5_dpll_device_mode_get() vs. what the function pointer prototype for ->mode_get() in 'struct dpll_device_ops' expects. drivers/net/ethernet/mellanox/mlx5/core/dpll.c:141:14: error: incompatible function pointer types initializing 'int (*)(const struct dpll_device *, void *, enum dpll_mode *, struct netlink_ext_ack *)' with an expression of type 'int (const struct dpll_device *, void *, u32 *, struct netlink_ext_ack *)' (aka 'int (const struct dpll_device *, void *, unsigned int *, struct netlink_ext_ack *)') [-Werror,-Wincompatible-function-pointer-types-strict] 141 | .mode_get = mlx5_dpll_device_mode_get, | ^~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. Change the type of the mode parameter in mlx5_dpll_device_mode_get() to clear up the warning and avoid kCFI failures at run time. Fixes: 496fd0a26bbf ("mlx5: Implement SyncE support using DPLL infrastructure") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231002-net-wifpts-dpll_mode_get-v1-2-a356a16413cf@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04ptp: Fix type of mode parameter in ptp_ocp_dpll_mode_get()Nathan Chancellor
When building with -Wincompatible-function-pointer-types-strict, a warning designed to catch potential kCFI failures at build time rather than run time due to incorrect function pointer types, there is a warning due to a mismatch between the type of the mode parameter in ptp_ocp_dpll_mode_get() vs. what the function pointer prototype for ->mode_get() in 'struct dpll_device_ops' expects. drivers/ptp/ptp_ocp.c:4353:14: error: incompatible function pointer types initializing 'int (*)(const struct dpll_device *, void *, enum dpll_mode *, struct netlink_ext_ack *)' with an expression of type 'int (const struct dpll_device *, void *, u32 *, struct netlink_ext_ack *)' (aka 'int (const struct dpll_device *, void *, unsigned int *, struct netlink_ext_ack *)') [-Werror,-Wincompatible-function-pointer-types-strict] 4353 | .mode_get = ptp_ocp_dpll_mode_get, | ^~~~~~~~~~~~~~~~~~~~~ 1 error generated. Change the type of the mode parameter in ptp_ocp_dpll_mode_get() to clear up the warning and avoid kCFI failures at run time. Fixes: 09eeb3aecc6c ("ptp_ocp: implement DPLL ops") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231002-net-wifpts-dpll_mode_get-v1-1-a356a16413cf@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04Merge branch 'r8152-modify-rx_bottom'Jakub Kicinski
Hayes Wang says: ==================== r8152: modify rx_bottom v3: For patch #1, this patch is replaced. The new patch only break the loop, and keep that the driver would queue the rx packets. For patch #2, modify the code depends on patch #1. For work_down < budget, napi_get_frags() and napi_gro_frags() would be used. For the others, nothing is changed. v2: For patch #1, add comment, update commit message, and add Fixes tag. v1: These patches are used to improve rx_bottom(). ==================== Link: https://lore.kernel.org/r/20230926111714.9448-432-nic_swsd@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04r8152: use napi_gro_fragsHayes Wang
Use napi_gro_frags() for the skb of fragments when the work_done is less than budget. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Link: https://lore.kernel.org/r/20230926111714.9448-434-nic_swsd@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04r8152: break the loop when the budget is exhaustedHayes Wang
A bulk transfer of the USB may contain many packets. And, the total number of the packets in the bulk transfer may be more than budget. Originally, only budget packets would be handled by napi_gro_receive(), and the other packets would be queued in the driver for next schedule. This patch would break the loop about getting next bulk transfer, when the budget is exhausted. That is, only the current bulk transfer would be handled, and the other bulk transfers would be queued for next schedule. Besides, the packets which are more than the budget in the current bulk trasnfer would be still queued in the driver, as the original method. In addition, a bulk transfer wouldn't contain more than 400 packets, so the check of queue length is unnecessary. Therefore, I replace it with WARN_ON_ONCE(). Fixes: cf74eb5a5bc8 ("eth: r8152: try to use a normal budget") Signed-off-by: Hayes Wang <hayeswang@realtek.com> Link: https://lore.kernel.org/r/20230926111714.9448-433-nic_swsd@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04Merge branch 'chelsio-annotate-structs-with-__counted_by'Jakub Kicinski
Kees Cook says: ==================== chelsio: Annotate structs with __counted_by This annotates several chelsio structures with the coming __counted_by attribute for bounds checking of flexible arrays at run-time. For more details, see commit dd06e72e68bc ("Compiler Attributes: Add __counted_by macro"). ==================== Link: https://lore.kernel.org/r/20230929181042.work.990-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04cxgb4: Annotate struct smt_data with __counted_byKees Cook
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct smt_data. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230929181149.3006432-5-keescook@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04cxgb4: Annotate struct sched_table with __counted_byKees Cook
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct sched_table. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230929181149.3006432-4-keescook@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04cxgb4: Annotate struct cxgb4_tc_u32_table with __counted_byKees Cook
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct cxgb4_tc_u32_table. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230929181149.3006432-3-keescook@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04cxgb4: Annotate struct clip_tbl with __counted_byKees Cook
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct clip_tbl. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230929181149.3006432-2-keescook@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04chelsio/l2t: Annotate struct l2t_data with __counted_byKees Cook
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct l2t_data. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Raju Rangoju <rajur@chelsio.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20230929181149.3006432-1-keescook@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04tcp: fix delayed ACKs for MSS boundary conditionNeal Cardwell
This commit fixes poor delayed ACK behavior that can cause poor TCP latency in a particular boundary condition: when an application makes a TCP socket write that is an exact multiple of the MSS size. The problem is that there is painful boundary discontinuity in the current delayed ACK behavior. With the current delayed ACK behavior, we have: (1) If an app reads data when > 1*MSS is unacknowledged, then tcp_cleanup_rbuf() ACKs immediately because of: tp->rcv_nxt - tp->rcv_wup > icsk->icsk_ack.rcv_mss || (2) If an app reads all received data, and the packets were < 1*MSS, and either (a) the app is not ping-pong or (b) we received two packets < 1*MSS, then tcp_cleanup_rbuf() ACKs immediately beecause of: ((icsk->icsk_ack.pending & ICSK_ACK_PUSHED2) || ((icsk->icsk_ack.pending & ICSK_ACK_PUSHED) && !inet_csk_in_pingpong_mode(sk))) && (3) *However*: if an app reads exactly 1*MSS of data, tcp_cleanup_rbuf() does not send an immediate ACK. This is true even if the app is not ping-pong and the 1*MSS of data had the PSH bit set, suggesting the sending application completed an application write. Thus if the app is not ping-pong, we have this painful case where >1*MSS gets an immediate ACK, and <1*MSS gets an immediate ACK, but a write whose last skb is an exact multiple of 1*MSS can get a 40ms delayed ACK. This means that any app that transfers data in one direction and takes care to align write size or packet size with MSS can suffer this problem. With receive zero copy making 4KB MSS values more common, it is becoming more common to have application writes naturally align with MSS, and more applications are likely to encounter this delayed ACK problem. The fix in this commit is to refine the delayed ACK heuristics with a simple check: immediately ACK a received 1*MSS skb with PSH bit set if the app reads all data. Why? If an skb has a len of exactly 1*MSS and has the PSH bit set then it is likely the end of an application write. So more data may not be arriving soon, and yet the data sender may be waiting for an ACK if cwnd-bound or using TX zero copy. Thus we set ICSK_ACK_PUSHED in this case so that tcp_cleanup_rbuf() will send an ACK immediately if the app reads all of the data and is not ping-pong. Note that this logic is also executed for the case where len > MSS, but in that case this logic does not matter (and does not hurt) because tcp_cleanup_rbuf() will always ACK immediately if the app reads data and there is more than an MSS of unACKed data. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Cc: Xin Guo <guoxin0309@gmail.com> Link: https://lore.kernel.org/r/20231001151239.1866845-2-ncardwell.sw@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04tcp: fix quick-ack counting to count actual ACKs of new dataNeal Cardwell
This commit fixes quick-ack counting so that it only considers that a quick-ack has been provided if we are sending an ACK that newly acknowledges data. The code was erroneously using the number of data segments in outgoing skbs when deciding how many quick-ack credits to remove. This logic does not make sense, and could cause poor performance in request-response workloads, like RPC traffic, where requests or responses can be multi-segment skbs. When a TCP connection decides to send N quick-acks, that is to accelerate the cwnd growth of the congestion control module controlling the remote endpoint of the TCP connection. That quick-ack decision is purely about the incoming data and outgoing ACKs. It has nothing to do with the outgoing data or the size of outgoing data. And in particular, an ACK only serves the intended purpose of allowing the remote congestion control to grow the congestion window quickly if the ACK is ACKing or SACKing new data. The fix is simple: only count packets as serving the goal of the quickack mechanism if they are ACKing/SACKing new data. We can tell whether this is the case by checking inet_csk_ack_scheduled(), since we schedule an ACK exactly when we are ACKing/SACKing new data. Fixes: fc6415bcb0f5 ("[TCP]: Fix quick-ack decrementing with TSO.") Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231001151239.1866845-1-ncardwell.sw@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-05pmdomain: imx: scu-pd: correct DMA2 channelPeng Fan
Per "dt-bindings/firmware/imx/rsrc.h", `IMX_SC_R_DMA_2_CH0 + 5` not equals to IMX_SC_R_DMA_2_CH5, so there should be two entries in imx8qxp_scu_pd_ranges, otherwise the imx_scu_add_pm_domain may filter out wrong power domains. Fixes: 927b7d15dcf2 ("genpd: imx: scu-pd: enlarge PD range") Reported-by: Dong Aisheng <Aisheng.dong@nxp.com> Signed-off-by: Peng Fan <peng.fan@nxp.com> Link: https://lore.kernel.org/r/20231001123853.200773-1-peng.fan@oss.nxp.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2023-10-04Merge tag 'nf-23-10-04' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Florian Westphal says: ==================== netfilter patches for net First patch resolves a regression with vlan header matching, this was broken since 6.5 release. From myself. Second patch fixes an ancient problem with sctp connection tracking in case INIT_ACK packets are delayed. This comes with a selftest, both patches from Xin Long. Patch 4 extends the existing nftables audit selftest, from Phil Sutter. Patch 5, also from Phil, avoids a situation where nftables would emit an audit record twice. This was broken since 5.13 days. Patch 6, from myself, avoids spurious insertion failure if we encounter an overlapping but expired range during element insertion with the 'nft_set_rbtree' backend. This problem exists since 6.2. * tag 'nf-23-10-04' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure netfilter: nf_tables: Deduplicate nft_register_obj audit logs selftests: netfilter: Extend nft_audit.sh selftests: netfilter: test for sctp collision processing in nf_conntrack netfilter: handle the connecting collision properly in nf_conntrack_proto_sctp netfilter: nft_payload: rebuild vlan header on h_proto access ==================== Link: https://lore.kernel.org/r/20231004141405.28749-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04Merge tag 'nf-next-23-09-28' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Florian Westphal says: ==================== netfilter updates for net-next First patch, from myself, is a bug fix. The issue (connect timeout) is ancient, so I think its safe to give this more soak time given the esoteric conditions needed to trigger this. Also updates the existing selftest to cover this. Add netlink extacks when an update references a non-existent table/chain/set. This allows userspace to provide much better errors to the user, from Pablo Neira Ayuso. Last patch adds more policy checks to nf_tables as a better alternative to the existing runtime checks, from Phil Sutter. * tag 'nf-next-23-09-28' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_tables: Utilize NLA_POLICY_NESTED_ARRAY netfilter: nf_tables: missing extended netlink error in lookup functions selftests: netfilter: test nat source port clash resolution interaction with tcp early demux netfilter: nf_nat: undo erroneous tcp edemux lookup after port clash ==================== Link: https://lore.kernel.org/r/20230928144916.18339-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04page_pool: fix documentation typosRandy Dunlap
Correct grammar for better readability. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Link: https://lore.kernel.org/r/20231001003846.29541-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04smb: client: do not start laundromat thread on nohandlecachePaulo Alcantara
Honor 'nohandlecache' mount option by not starting laundromat thread even when SMB server supports directory leases. Do not waste system resources by having laundromat thread running with no directory caching at all. Fixes: 2da338ff752a ("smb3: do not start laundromat thread when dir leases disabled") Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-10-04smb: use kernel_connect() and kernel_bind()Jordan Rife
Recent changes to kernel_connect() and kernel_bind() ensure that callers are insulated from changes to the address parameter made by BPF SOCK_ADDR hooks. This patch wraps direct calls to ops->connect() and ops->bind() with kernel_connect() and kernel_bind() to ensure that SMB mounts do not see their mount address overwritten in such cases. Link: https://lore.kernel.org/netdev/9944248dba1bce861375fcce9de663934d933ba9.camel@redhat.com/ Cc: <stable@vger.kernel.org> # 6.0+ Signed-off-by: Jordan Rife <jrife@google.com> Acked-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-10-04sctp: Spelling s/preceeding/preceding/gGeert Uytterhoeven
Fix a misspelling of "preceding". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/663b14d07d6d716ddc34482834d6b65a2f714cfb.1695903447.git.geert+renesas@glider.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04Merge branch 'selftest/bpf, riscv: Improved cross-building support'Andrii Nakryiko
Björn Töpel says: ==================== From: Björn Töpel <bjorn@rivosinc.com> Yet another "more cross-building support for RISC-V" series. An example how to invoke a gen_tar build: | make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- CC=riscv64-linux-gnu-gcc \ | HOSTCC=gcc O=/workspace/kbuild FORMAT= \ | SKIP_TARGETS="arm64 ia64 powerpc sparc64 x86 sgx" -j $(($(nproc)-1)) \ | -C tools/testing/selftests gen_tar Björn ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2023-10-04selftests/bpf: Add uprobe_multi to gen_tar targetBjörn Töpel
The uprobe_multi program was not picked up for the gen_tar target. Fix by adding it to TEST_GEN_FILES. Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20231004122721.54525-4-bjorn@kernel.org
2023-10-04selftests/bpf: Enable lld usage for RISC-VBjörn Töpel
RISC-V has proper lld support. Use that, similar to what x86 does, for urandom_read et al. Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231004122721.54525-3-bjorn@kernel.org
2023-10-04selftests/bpf: Add cross-build support for urandom_read et alBjörn Töpel
Some userland programs in the BPF test suite, e.g. urandom_read, is missing cross-build support. Add cross-build support for these programs Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231004122721.54525-2-bjorn@kernel.org
2023-10-04tipc: fix a potential deadlock on &tx->lockChengfeng Ye
It seems that tipc_crypto_key_revoke() could be be invoked by wokequeue tipc_crypto_work_rx() under process context and timer/rx callback under softirq context, thus the lock acquisition on &tx->lock seems better use spin_lock_bh() to prevent possible deadlock. This flaw was found by an experimental static analysis tool I am developing for irq-related deadlock. tipc_crypto_work_rx() <workqueue> --> tipc_crypto_key_distr() --> tipc_bcast_xmit() --> tipc_bcbase_xmit() --> tipc_bearer_bc_xmit() --> tipc_crypto_xmit() --> tipc_ehdr_build() --> tipc_crypto_key_revoke() --> spin_lock(&tx->lock) <timer interrupt> --> tipc_disc_timeout() --> tipc_bearer_xmit_skb() --> tipc_crypto_xmit() --> tipc_ehdr_build() --> tipc_crypto_key_revoke() --> spin_lock(&tx->lock) <deadlock here> Signed-off-by: Chengfeng Ye <dg573847474@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication") Link: https://lore.kernel.org/r/20230927181414.59928-1-dg573847474@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04net: stmmac: dwmac-stm32: fix resume on STM32 MCUBen Wolsieffer
The STM32MP1 keeps clk_rx enabled during suspend, and therefore the driver does not enable the clock in stm32_dwmac_init() if the device was suspended. The problem is that this same code runs on STM32 MCUs, which do disable clk_rx during suspend, causing the clock to never be re-enabled on resume. This patch adds a variant flag to indicate that clk_rx remains enabled during suspend, and uses this to decide whether to enable the clock in stm32_dwmac_init() if the device was suspended. This approach fixes this specific bug with limited opportunity for unintended side-effects, but I have a follow up patch that will refactor the clock configuration and hopefully make it less error prone. Fixes: 6528e02cc9ff ("net: ethernet: stmmac: add adaptation for stm32mp157c.") Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20230927175749.1419774-1-ben.wolsieffer@hefring.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-04Merge branch 'libbpf/selftests syscall wrapper fixes for RISC-V'Andrii Nakryiko
Björn Töpel says: ==================== From: Björn Töpel <bjorn@rivosinc.com> Commit 08d0ce30e0e4 ("riscv: Implement syscall wrappers") introduced some regressions in libbpf, and the kselftests BPF suite, which are fixed with these three patches. Note that there's an outstanding fix [1] for ftrace syscall tracing which is also a fallout from the commit above. Björn [1] https://lore.kernel.org/linux-riscv/20231003182407.32198-1-alexghiti@rivosinc.com/ Alexandre Ghiti (1): libbpf: Fix syscall access arguments on riscv ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2023-10-04selftests/bpf: Define SYS_NANOSLEEP_KPROBE_NAME for riscvBjörn Töpel
Add missing sys_nanosleep name for RISC-V, which is used by some tests (e.g. attach_probe). Fixes: 08d0ce30e0e4 ("riscv: Implement syscall wrappers") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/bpf/20231004110905.49024-4-bjorn@kernel.org
2023-10-04selftests/bpf: Define SYS_PREFIX for riscvBjörn Töpel
SYS_PREFIX was missing for a RISC-V, which made a couple of kprobe tests fail. Add missing SYS_PREFIX for RISC-V. Fixes: 08d0ce30e0e4 ("riscv: Implement syscall wrappers") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/bpf/20231004110905.49024-3-bjorn@kernel.org
2023-10-04libbpf: Fix syscall access arguments on riscvAlexandre Ghiti
Since commit 08d0ce30e0e4 ("riscv: Implement syscall wrappers"), riscv selects ARCH_HAS_SYSCALL_WRAPPER so let's use the generic implementation of PT_REGS_SYSCALL_REGS(). Fixes: 08d0ce30e0e4 ("riscv: Implement syscall wrappers") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/bpf/20231004110905.49024-2-bjorn@kernel.org
2023-10-04ptp: ocp: fix error code in probe()Dan Carpenter
There is a copy and paste error so this uses a valid pointer instead of an error pointer. Fixes: 09eeb3aecc6c ("ptp_ocp: implement DPLL ops") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://lore.kernel.org/r/5c581336-0641-48bd-88f7-51984c3b1f79@moroto.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>