summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-08-07function_graph: Fix the ret_stack used by ftrace_graph_ret_addr()Petr Pavlu
When ftrace_graph_ret_addr() is invoked to convert a found stack return address to its original value, the function can end up producing the following crash: [ 95.442712] BUG: kernel NULL pointer dereference, address: 0000000000000028 [ 95.442720] #PF: supervisor read access in kernel mode [ 95.442724] #PF: error_code(0x0000) - not-present page [ 95.442727] PGD 0 P4D 0- [ 95.442731] Oops: Oops: 0000 [#1] PREEMPT SMP PTI [ 95.442736] CPU: 1 UID: 0 PID: 2214 Comm: insmod Kdump: loaded Tainted: G OE K 6.11.0-rc1-default #1 67c62a3b3720562f7e7db5f11c1fdb40b7a2857c [ 95.442747] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE, [K]=LIVEPATCH [ 95.442750] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 [ 95.442754] RIP: 0010:ftrace_graph_ret_addr+0x42/0xc0 [ 95.442766] Code: [...] [ 95.442773] RSP: 0018:ffff979b80ff7718 EFLAGS: 00010006 [ 95.442776] RAX: ffffffff8ca99b10 RBX: ffff979b80ff7760 RCX: ffff979b80167dc0 [ 95.442780] RDX: ffffffff8ca99b10 RSI: ffff979b80ff7790 RDI: 0000000000000005 [ 95.442783] RBP: 0000000000000001 R08: 0000000000000005 R09: 0000000000000000 [ 95.442786] R10: 0000000000000005 R11: 0000000000000000 R12: ffffffff8e9491e0 [ 95.442790] R13: ffffffff8d6f70f0 R14: ffff979b80167da8 R15: ffff979b80167dc8 [ 95.442793] FS: 00007fbf83895740(0000) GS:ffff8a0afdd00000(0000) knlGS:0000000000000000 [ 95.442797] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 95.442800] CR2: 0000000000000028 CR3: 0000000005070002 CR4: 0000000000370ef0 [ 95.442806] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 95.442809] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 95.442816] Call Trace: [ 95.442823] <TASK> [ 95.442896] unwind_next_frame+0x20d/0x830 [ 95.442905] arch_stack_walk_reliable+0x94/0xe0 [ 95.442917] stack_trace_save_tsk_reliable+0x7d/0xe0 [ 95.442922] klp_check_and_switch_task+0x55/0x1a0 [ 95.442931] task_call_func+0xd3/0xe0 [ 95.442938] klp_try_switch_task.part.5+0x37/0x150 [ 95.442942] klp_try_complete_transition+0x79/0x2d0 [ 95.442947] klp_enable_patch+0x4db/0x890 [ 95.442960] do_one_initcall+0x41/0x2e0 [ 95.442968] do_init_module+0x60/0x220 [ 95.442975] load_module+0x1ebf/0x1fb0 [ 95.443004] init_module_from_file+0x88/0xc0 [ 95.443010] idempotent_init_module+0x190/0x240 [ 95.443015] __x64_sys_finit_module+0x5b/0xc0 [ 95.443019] do_syscall_64+0x74/0x160 [ 95.443232] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 95.443236] RIP: 0033:0x7fbf82f2c709 [ 95.443241] Code: [...] [ 95.443247] RSP: 002b:00007fffd5ea3b88 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 95.443253] RAX: ffffffffffffffda RBX: 000056359c48e750 RCX: 00007fbf82f2c709 [ 95.443257] RDX: 0000000000000000 RSI: 000056356ed4efc5 RDI: 0000000000000003 [ 95.443260] RBP: 000056356ed4efc5 R08: 0000000000000000 R09: 00007fffd5ea3c10 [ 95.443263] R10: 0000000000000003 R11: 0000000000000246 R12: 0000000000000000 [ 95.443267] R13: 000056359c48e6f0 R14: 0000000000000000 R15: 0000000000000000 [ 95.443272] </TASK> [ 95.443274] Modules linked in: [...] [ 95.443385] Unloaded tainted modules: intel_uncore_frequency(E):1 isst_if_common(E):1 skx_edac(E):1 [ 95.443414] CR2: 0000000000000028 The bug can be reproduced with kselftests: cd linux/tools/testing/selftests make TARGETS='ftrace livepatch' (cd ftrace; ./ftracetest test.d/ftrace/fgraph-filter.tc) (cd livepatch; ./test-livepatch.sh) The problem is that ftrace_graph_ret_addr() is supposed to operate on the ret_stack of a selected task but wrongly accesses the ret_stack of the current task. Specifically, the above NULL dereference occurs when task->curr_ret_stack is non-zero, but current->ret_stack is NULL. Correct ftrace_graph_ret_addr() to work with the right ret_stack. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Reported-by: Miroslav Benes <mbenes@suse.cz> Link: https://lore.kernel.org/20240803131211.17255-1-petr.pavlu@suse.com Fixes: 7aa1eaef9f42 ("function_graph: Allow multiple users to attach to function graph") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-08-07eventfs: Use SRCU for freeing eventfs_inodesMathias Krause
To mirror the SRCU lock held in eventfs_iterate() when iterating over eventfs inodes, use call_srcu() to free them too. This was accidentally(?) degraded to RCU in commit 43aa6f97c2d0 ("eventfs: Get rid of dentry pointers without refcounts"). Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20240723210755.8970-1-minipli@grsecurity.net Fixes: 43aa6f97c2d0 ("eventfs: Get rid of dentry pointers without refcounts") Signed-off-by: Mathias Krause <minipli@grsecurity.net> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-08-07eventfs: Don't return NULL in eventfs_create_dir()Mathias Krause
Commit 77a06c33a22d ("eventfs: Test for ei->is_freed when accessing ei->dentry") added another check, testing if the parent was freed after we released the mutex. If so, the function returns NULL. However, all callers expect it to either return a valid pointer or an error pointer, at least since commit 5264a2f4bb3b ("tracing: Fix a NULL vs IS_ERR() bug in event_subsystem_dir()"). Returning NULL will therefore fail the error condition check in the caller. Fix this by substituting the NULL return value with a fitting error pointer. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: stable@vger.kernel.org Fixes: 77a06c33a22d ("eventfs: Test for ei->is_freed when accessing ei->dentry") Link: https://lore.kernel.org/20240723122522.2724-1-minipli@grsecurity.net Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Ajay Kaher <ajay.kaher@broadcom.com> Signed-off-by: Mathias Krause <minipli@grsecurity.net> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-08-07tracefs: Fix inode allocationMathias Krause
The leading comment above alloc_inode_sb() is pretty explicit about it: /* * This must be used for allocating filesystems specific inodes to set * up the inode reclaim context correctly. */ Switch tracefs over to alloc_inode_sb() to make sure inodes are properly linked. Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20240807115143.45927-2-minipli@grsecurity.net Fixes: ba37ff75e04b ("eventfs: Implement tracefs_inode_cache") Signed-off-by: Mathias Krause <minipli@grsecurity.net> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-08-07tracing: Use refcount for trace_event_file reference counterSteven Rostedt
Instead of using an atomic counter for the trace_event_file reference counter, use the refcount interface. It has various checks to make sure the reference counting is correct, and will warn if it detects an error (like refcount_inc() on '0'). Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20240726144208.687cce24@rorschach.local.home Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-08-07tracing: Have format file honor EVENT_FILE_FL_FREEDSteven Rostedt
When eventfs was introduced, special care had to be done to coordinate the freeing of the file meta data with the files that are exposed to user space. The file meta data would have a ref count that is set when the file is created and would be decremented and freed after the last user that opened the file closed it. When the file meta data was to be freed, it would set a flag (EVENT_FILE_FL_FREED) to denote that the file is freed, and any new references made (like new opens or reads) would fail as it is marked freed. This allowed other meta data to be freed after this flag was set (under the event_mutex). All the files that were dynamically created in the events directory had a pointer to the file meta data and would call event_release() when the last reference to the user space file was closed. This would be the time that it is safe to free the file meta data. A shortcut was made for the "format" file. It's i_private would point to the "call" entry directly and not point to the file's meta data. This is because all format files are the same for the same "call", so it was thought there was no reason to differentiate them. The other files maintain state (like the "enable", "trigger", etc). But this meant if the file were to disappear, the "format" file would be unaware of it. This caused a race that could be trigger via the user_events test (that would create dynamic events and free them), and running a loop that would read the user_events format files: In one console run: # cd tools/testing/selftests/user_events # while true; do ./ftrace_test; done And in another console run: # cd /sys/kernel/tracing/ # while true; do cat events/user_events/__test_event/format; done 2>/dev/null With KASAN memory checking, it would trigger a use-after-free bug report (which was a real bug). This was because the format file was not checking the file's meta data flag "EVENT_FILE_FL_FREED", so it would access the event that the file meta data pointed to after the event was freed. After inspection, there are other locations that were found to not check the EVENT_FILE_FL_FREED flag when accessing the trace_event_file. Add a new helper function: event_file_file() that will make sure that the event_mutex is held, and will return NULL if the trace_event_file has the EVENT_FILE_FL_FREED flag set. Have the first reference of the struct file pointer use event_file_file() and check for NULL. Later uses can still use the event_file_data() helper function if the event_mutex is still held and was not released since the event_file_file() call. Link: https://lore.kernel.org/all/20240719204701.1605950-1-minipli@grsecurity.net/ Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Ajay Kaher <ajay.kaher@broadcom.com> Cc: Ilkka Naulapää <digirigawa@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Beau Belgrave <beaub@linux.microsoft.com> Cc: Florian Fainelli <florian.fainelli@broadcom.com> Cc: Alexey Makhalov <alexey.makhalov@broadcom.com> Cc: Vasavi Sirnapalli <vasavi.sirnapalli@broadcom.com> Link: https://lore.kernel.org/20240730110657.3b69d3c1@gandalf.local.home Fixes: b63db58e2fa5d ("eventfs/tracing: Add callback for release of an eventfs_inode") Reported-by: Mathias Krause <minipli@grsecurity.net> Tested-by: Mathias Krause <minipli@grsecurity.net> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-08-07Bluetooth: hci_sync: avoid dup filtering when passive scanning with adv monitorAnton Khirnov
This restores behaviour (including the comment) from now-removed hci_request.c, and also matches existing code for active scanning. Without this, the duplicates filter is always active when passive scanning, which makes it impossible to work with devices that send nontrivial dynamic data in their advertisement reports. Fixes: abfeea476c68 ("Bluetooth: hci_sync: Convert MGMT_OP_START_DISCOVERY") Signed-off-by: Anton Khirnov <anton@khirnov.net> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-08-07Bluetooth: l2cap: always unlock channel in l2cap_conless_channel()Dmitry Antipov
Add missing call to 'l2cap_chan_unlock()' on receive error handling path in 'l2cap_conless_channel()'. Fixes: a24cce144b98 ("Bluetooth: Fix reference counting of global L2CAP channels") Reported-by: syzbot+45ac74737e866894acb0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=45ac74737e866894acb0 Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-08-07Bluetooth: hci_qca: fix a NULL-pointer derefence at shutdownBartosz Golaszewski
Unlike qca_regulator_init(), qca_power_shutdown() may be called for QCA_ROME which does not have qcadev->bt_power assigned. Add a NULL-pointer check before dereferencing the struct qca_power pointer. Fixes: eba1718717b0 ("Bluetooth: hci_qca: make pwrseq calls the default if available") Reported-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Closes: https://lore.kernel.org/linux-bluetooth/su3wp6s44hrxf4ijvsdfzbvv4unu4ycb7kkvwbx6ltdafkldir@4g7ydqm2ap5j/ Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-08-07Bluetooth: hci_qca: fix QCA6390 support on non-DT platformsBartosz Golaszewski
QCA6390 can albo be used on non-DT systems so we must not make the power sequencing the only option. Check if the serdev device consumes an OF node. If so: honor the new contract as per the DT bindings. If not: fall back to the previous behavior by falling through to the existing default label. Fixes: 9a15ce685706 ("Bluetooth: qca: use the power sequencer for QCA6390") Reported-by: Wren Turkal <wt@penguintechs.org> Closes: https://lore.kernel.org/linux-bluetooth/27e6a6c5-fb63-4219-be0b-eefa2c116e06@penguintechs.org/ Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-08-07Bluetooth: hci_qca: don't call pwrseq_power_off() twice for QCA6390Bartosz Golaszewski
Now that we call pwrseq_power_off() for all models that hold a valid power sequencing handle, we can remove the switch case for QCA_6390. The switch will now use the default label for this model but that's fine: if it has the BT-enable GPIO than we should use it. Fixes: eba1718717b0 ("Bluetooth: hci_qca: make pwrseq calls the default if available") Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-08-07igc: Fix qbv tx latency by setting gtxoffsetFaizal Rahim
A large tx latency issue was discovered during testing when only QBV was enabled. The issue occurs because gtxoffset was not set when QBV is active, it was only set when launch time is active. The patch "igc: Correct the launchtime offset" only sets gtxoffset when the launchtime_enable field is set by the user. Enabling launchtime_enable ultimately sets the register IGC_TXQCTL_QUEUE_MODE_LAUNCHT (referred to as LaunchT in the SW user manual). Section 7.5.2.6 of the IGC i225/6 SW User Manual Rev 1.2.4 states: "The latency between transmission scheduling (launch time) and the time the packet is transmitted to the network is listed in Table 7-61." However, the patch misinterprets the phrase "launch time" in that section by assuming it specifically refers to the LaunchT register, whereas it actually denotes the generic term for when a packet is released from the internal buffer to the MAC transmit logic. This launch time, as per that section, also implicitly refers to the QBV gate open time, where a packet waits in the buffer for the QBV gate to open. Therefore, latency applies whenever QBV is in use. TSN features such as QBU and QAV reuse QBV, making the latency universal to TSN features. Discussed with i226 HW owner (Shalev, Avi) and we were in agreement that the term "launch time" used in Section 7.5.2.6 is not clear and can be easily misinterpreted. Avi will update this section to: "When TQAVCTRL.TRANSMIT_MODE = TSN, the latency between transmission scheduling and the time the packet is transmitted to the network is listed in Table 7-61." Fix this issue by using igc_tsn_is_tx_mode_in_tsn() as a condition to write to gtxoffset, aligning with the newly updated SW User Manual. Tested: 1. Enrol taprio on talker board base-time 0 cycle-time 1000000 flags 0x2 index 0 cmd S gatemask 0x1 interval1 index 0 cmd S gatemask 0x1 interval2 Note: interval1 = interval for a 64 bytes packet to go through interval2 = cycle-time - interval1 2. Take tcpdump on listener board 3. Use udp tai app on talker to send packets to listener 4. Check the timestamp on listener via wireshark Test Result: 100 Mbps: 113 ~193 ns 1000 Mbps: 52 ~ 84 ns 2500 Mbps: 95 ~ 223 ns Note that the test result is similar to the patch "igc: Correct the launchtime offset". Fixes: 790835fcc0cb ("igc: Correct the launchtime offset") Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-07igc: Fix reset adapter logics when tx mode changeFaizal Rahim
Following the "igc: Fix TX Hang issue when QBV Gate is close" changes, remaining issues with the reset adapter logic in igc_tsn_offload_apply() have been observed: 1. The reset adapter logics for i225 and i226 differ, although they should be the same according to the guidelines in I225/6 HW Design Section 7.5.2.1 on software initialization during tx mode changes. 2. The i225 resets adapter every time, even though tx mode doesn't change. This occurs solely based on the condition igc_is_device_id_i225() when calling schedule_work(). 3. i226 doesn't reset adapter for tsn->legacy tx mode changes. It only resets adapter for legacy->tsn tx mode transitions. 4. qbv_count introduced in the patch is actually not needed; in this context, a non-zero value of qbv_count is used to indicate if tx mode was unconditionally set to tsn in igc_tsn_enable_offload(). This could be replaced by checking the existing register IGC_TQAVCTRL_TRANSMIT_MODE_TSN bit. This patch resolves all issues and enters schedule_work() to reset the adapter only when changing tx mode. It also removes reliance on qbv_count. qbv_count field will be removed in a future patch. Test ran: 1. Verify reset adapter behaviour in i225/6: a) Enrol a new GCL Reset adapter observed (tx mode change legacy->tsn) b) Enrol a new GCL without deleting qdisc No reset adapter observed (tx mode remain tsn->tsn) c) Delete qdisc Reset adapter observed (tx mode change tsn->legacy) 2. Tested scenario from "igc: Fix TX Hang issue when QBV Gate is closed" to confirm it remains resolved. Fixes: 175c241288c0 ("igc: Fix TX Hang issue when QBV Gate is closed") Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-07igc: Fix qbv_config_change_errors logicsFaizal Rahim
When user issues these cmds: 1. Either a) or b) a) mqprio with hardware offload disabled b) taprio with txtime-assist feature enabled 2. etf 3. tc qdisc delete 4. taprio with base time in the past At step 4, qbv_config_change_errors wrongly increased by 1. Excerpt from IEEE 802.1Q-2018 8.6.9.3.1: "If AdminBaseTime specifies a time in the past, and the current schedule is running, then: Increment ConfigChangeError counter" qbv_config_change_errors should only increase if base time is in the past and no taprio is active. In user perspective, taprio was not active when first triggered at step 4. However, i225/6 reuses qbv for etf, so qbv is enabled with a dummy schedule at step 2 where it enters igc_tsn_enable_offload() and qbv_count got incremented to 1. At step 4, it enters igc_tsn_enable_offload() again, qbv_count is incremented to 2. Because taprio is running, tc_setup_type is TC_SETUP_QDISC_ETF and qbv_count > 1, qbv_config_change_errors value got incremented. This issue happens due to reliance on qbv_count field where a non-zero value indicates that taprio is running. But qbv_count increases regardless if taprio is triggered by user or by other tsn feature. It does not align with qbv_config_change_errors expectation where it is only concerned with taprio triggered by user. Fixing this by relocating the qbv_config_change_errors logic to igc_save_qbv_schedule(), eliminating reliance on qbv_count and its inaccuracies from i225/6's multiple uses of qbv feature for other TSN features. The new function created: igc_tsn_is_taprio_activated_by_user() uses taprio_offload_enable field to indicate that the current running taprio was triggered by user, instead of triggered by non-qbv feature like etf. Fixes: ae4fe4698300 ("igc: Add qbv_config_change_errors counter") Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-07igc: Fix packet still tx after gate close by reducing i226 MAC retry bufferFaizal Rahim
Testing uncovered that even when the taprio gate is closed, some packets still transmit. According to i225/6 hardware errata [1], traffic might overflow the planned QBV window. This happens because MAC maintains an internal buffer, primarily for supporting half duplex retries. Therefore, even when the gate closes, residual MAC data in the buffer may still transmit. To mitigate this for i226, reduce the MAC's internal buffer from 192 bytes to the recommended 88 bytes by modifying the RETX_CTL register value. This follows guidelines from: [1] Ethernet Controller I225/I22 Spec Update Rev 2.1 Errata Item 9: TSN: Packet Transmission Might Cross Qbv Window [2] I225/6 SW User Manual Rev 1.2.4: Section 8.11.5 Retry Buffer Control Note that the RETX_CTL register can't be used in TSN mode because half duplex feature cannot coexist with TSN. Test Steps: 1. Send taprio cmd to board A: tc qdisc replace dev enp1s0 parent root handle 100 taprio \ num_tc 4 \ map 3 2 1 0 3 3 3 3 3 3 3 3 3 3 3 3 \ queues 1@0 1@1 1@2 1@3 \ base-time 0 \ sched-entry S 0x07 500000 \ sched-entry S 0x0f 500000 \ flags 0x2 \ txtime-delay 0 Note that for TC3, gate should open for 500us and close for another 500us. 3. Take tcpdump log on Board B. 4. Send udp packets via UDP tai app from Board A to Board B. 5. Analyze tcpdump log via wireshark log on Board B. Ensure that the total time from the first to the last packet received during one cycle for TC3 does not exceed 500us. Fixes: 43546211738e ("igc: Add new device ID's") Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-07ice: Fix incorrect assigns of FEC countsMateusz Polchlopek
Commit ac21add2540e ("ice: Implement driver functionality to dump fec statistics") introduces obtaining FEC correctable and uncorrectable stats per netdev in ICE driver. Unfortunately the assignment of values to fec_stats structure has been done incorrectly. This commit fixes the assignments. Fixes: ac21add2540e ("ice: Implement driver functionality to dump fec statistics") Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-07ice: Skip PTP HW writes during PTP reset procedureGrzegorz Nitka
Block HW write access for the driver while the device is in reset to avoid potential race condition and access to the PTP HW in non-nominal state which could lead to undesired effects Fixes: 4aad5335969f ("ice: add individual interrupt allocation") Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Co-developed-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Sergey Temerkhanov <sergey.temerkhanov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-07ice: Fix reset handlerGrzegorz Nitka
Synchronize OICR IRQ when preparing for reset to avoid potential race conditions between the reset procedure and OICR Fixes: 4aad5335969f ("ice: add individual interrupt allocation") Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Sergey Temerkhanov <sergey.temerkhanov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-08-07wifi: rtlwifi: rtl8192du: Initialise value32 in ↵Bitterblue Smith
_rtl92du_init_queue_reserved_page GCC complains: In file included from include/linux/ieee80211.h:21, from include/net/mac80211.h:20, from drivers/net/wireless/realtek/rtlwifi/rtl8192du/../wifi.h:14, from drivers/net/wireless/realtek/rtlwifi/rtl8192du/hw.c:4: In function 'u32p_replace_bits', inlined from '_rtl92du_init_queue_reserved_page.isra' at drivers/net/wireless/realtek/rtlwifi/rtl8192du/hw.c:225:2: >> include/linux/bitfield.h:189:18: warning: 'value32' is used uninitialized [-Wuninitialized] Part of the variable is indeed left uninitialised. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202408062100.DWhN0CYH-lkp@intel.com/ Fixes: e769c67105d3 ("wifi: rtlwifi: Add rtl8192du/hw.{c,h}") Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com> Acked-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/2a808244-93d0-492c-b304-ae1974df5df9@gmail.com
2024-08-07Merge tag 'for-6.11-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix double inode unlock for direct IO sync writes (reported by syzbot) - fix root tree id/name map definitions, don't use fixed size buffers for name (reported by -Werror=unterminated-string-initialization) - fix qgroup reserve leaks in bufferd write path - update scrub status structure more often so it can be reported in user space more accurately and let 'resume' not repeat work - in preparation to remove space cache v1 in the future print a warning if it's detected * tag 'for-6.11-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: avoid using fixed char array size for tree names btrfs: fix double inode unlock for direct IO sync writes btrfs: emit a warning about space cache v1 being deprecated btrfs: fix qgroup reserve leaks in cow_file_range btrfs: implement launder_folio for clearing dirty page reserve btrfs: scrub: update last_physical after scrubbing one stripe btrfs: factor out stripe length calculation into a helper
2024-08-07Merge tag 'for-v6.11-rc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply Pull power supply fixes from Sebastian Reichel: "rt5033: - fix driver regression causing kernel oops axp288-charger: - fix charge voltage setup qcom-battmgr: - fix thermal zone spamming errors - fix init on Qualcomm X Elite" * tag 'for-v6.11-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: power: supply: qcom_battmgr: Ignore extra __le32 in info payload power: supply: qcom_battmgr: return EAGAIN when firmware service is not up power: supply: axp288_charger: Round constant_charge_voltage writes down power: supply: axp288_charger: Fix constant_charge_voltage writes power: supply: rt5033: Bring back i2c_set_clientdata
2024-08-07bcachefs: ec should not allocate from ro devsKent Overstreet
This fixes a device removal deadlock when using erasure coding. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-07bcachefs: Improved allocator debugging for ecKent Overstreet
chasing down a device removal deadlock with erasure coding Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-07bcachefs: Add missing bch2_trans_begin() callKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-07bcachefs: Add a comment for bucket helper typesKent Overstreet
We've had bugs in the past with incorrect integer conversions in disk accounting code, which is why bucket helpers now always return s64s; add a comment explaining this. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-07bcachefs: Don't rely on implicit unsigned -> signed integer conversionKent Overstreet
implicit integer conversion is a fertile source of bugs, and we really would rather not have the min()/max() macros doing it implicitly. bcachefs appears to be the only place in the kernel where this happens, so let's fix it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-07lockdep: Fix lockdep_set_notrack_class() for CONFIG_LOCK_STATKent Overstreet
We won't find a contended lock if it's not being tracked. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-07LoongArch: KVM: Remove undefined a6 argument comment for kvm_hypercall()Dandan Zhang
The kvm_hypercall() set for LoongArch is limited to a1-a5. So the mention of a6 in the comment is undefined that needs to be rectified. Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Dandan Zhang <zhangdandan@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-08-07LoongArch: KVM: Remove unnecessary definition of KVM_PRIVATE_MEM_SLOTSYuli Wang
1. "KVM_PRIVATE_MEM_SLOTS" is renamed as "KVM_INTERNAL_MEM_SLOTS". 2. "KVM_INTERNAL_MEM_SLOTS" defaults to zero, so it is not necessary to define it in LoongArch's asm/kvm_host.h. Link: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=bdd1c37a315bc50ab14066c4852bc8dcf070451e Link: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=b075450868dbc0950f0942617f222eeb989cad10 Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Signed-off-by: Yuli Wang <wangyuli@uniontech.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-08-07LoongArch: Use accessors to page table entries instead of direct dereferenceHuacai Chen
As very well explained in commit 20a004e7b017cce282 ("arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables"), an architecture whose page table walker can modify the PTE in parallel must use READ_ONCE()/ WRITE_ONCE() macro to avoid any compiler transformation. So apply that to LoongArch which is such an architecture, in order to avoid potential problems. Similar to commit edf955647269422e ("riscv: Use accessors to page table entries instead of direct dereference"). Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-08-07LoongArch: Enable general EFI poweroff methodMiao Wang
efi_shutdown_init() can register a general sys_off handler named efi_power_off(). Enable this by providing efi_poweroff_required(), like arm and x86. Since EFI poweroff is also supported on LoongArch, and the enablement makes the poweroff function usable for hardwares which lack ACPI S5. We prefer ACPI poweroff rather than EFI poweroff (like x86), so we only require EFI poweroff if acpi_gbl_reduced_hardware or acpi_no_s5 is true. Cc: stable@vger.kernel.org Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Miao Wang <shankerwangmiao@gmail.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-08-06net: usb: qmi_wwan: add MeiG Smart SRM825LZHANG Yuntian
Add support for MeiG Smart SRM825L which is based on Qualcomm 315 chip. T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=5000 MxCh= 0 D: Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=2dee ProdID=4d22 Rev= 4.14 S: Manufacturer=MEIG S: Product=LTE-A Module S: SerialNumber=6f345e48 C:* #Ifs= 6 Cfg#= 1 Atr=80 MxPwr=896mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none) E: Ad=05(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=88(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=89(I) Atr=03(Int.) MxPS= 8 Ivl=32ms E: Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms Signed-off-by: ZHANG Yuntian <yt@radxa.com> Link: https://patch.msgid.link/D1EB81385E405DFE+20240803074656.567061-1-yt@radxa.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-06net: dsa: microchip: Fix Wake-on-LAN check to not return an errorTristram Ha
The wol variable in ksz_port_set_mac_address() is declared with random data, but the code in ksz_get_wol call may not be executed so the WAKE_MAGIC check may be invalid resulting in an error message when setting a MAC address after starting the DSA driver. Fixes: 3b454b6390c3 ("net: dsa: microchip: ksz9477: Add Wake on Magic Packet support") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20240805235200.24982-1-Tristram.Ha@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-06net: linkwatch: use system_unbound_wqEric Dumazet
linkwatch_event() grabs possibly very contended RTNL mutex. system_wq is not suitable for such work. Inspired by many noisy syzbot reports. 3 locks held by kworker/0:7/5266: #0: ffff888015480948 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3206 [inline] #0: ffff888015480948 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x90a/0x1830 kernel/workqueue.c:3312 #1: ffffc90003f6fd00 ((linkwatch_work).work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3207 [inline] , at: process_scheduled_works+0x945/0x1830 kernel/workqueue.c:3312 #2: ffffffff8fa6f208 (rtnl_mutex){+.+.}-{3:3}, at: linkwatch_event+0xe/0x60 net/core/link_watch.c:276 Reported-by: syzbot <syzkaller@googlegroups.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20240805085821.1616528-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-06Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio fix from Michael Tsirkin: "Fix a single, long-standing issue with kick pass-through vdpa" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost-vdpa: switch to use vmf_insert_pfn() in the fault handler
2024-08-06Merge tag 'platform-drivers-x86-v6.11-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Ilpo Järvinen: "Fixes: - Fix ACPI notifier racing with itself (intel-vbtn) - Initialize local variable to cover a timeout corner case (intel/ifs) - WMI docs spelling New device IDs: - amd/{pmc,pmf}: AMD 1Ah model 60h series. - amd/pmf: SPS quirk support for ASUS ROG Ally X" * tag 'platform-drivers-x86-v6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/intel/ifs: Initialize union ifs_status to zero platform/x86: msi-wmi-platform: Fix spelling mistakes platform/x86/amd/pmf: Add new ACPI ID AMDI0107 platform/x86/amd/pmc: Send OS_HINT command for new AMD platform platform/x86/amd: pmf: Add quirk for ROG Ally X platform/x86: intel-vbtn: Protect ACPI notify handler against recursion
2024-08-05net: bridge: mcast: wait for previous gc cycles when removing portNikolay Aleksandrov
syzbot hit a use-after-free[1] which is caused because the bridge doesn't make sure that all previous garbage has been collected when removing a port. What happens is: CPU 1 CPU 2 start gc cycle remove port acquire gc lock first wait for lock call br_multicasg_gc() directly acquire lock now but free port the port can be freed while grp timers still running Make sure all previous gc cycles have finished by using flush_work before freeing the port. [1] BUG: KASAN: slab-use-after-free in br_multicast_port_group_expired+0x4c0/0x550 net/bridge/br_multicast.c:861 Read of size 8 at addr ffff888071d6d000 by task syz.5.1232/9699 CPU: 1 PID: 9699 Comm: syz.5.1232 Not tainted 6.10.0-rc5-syzkaller-00021-g24ca36a562d6 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:114 print_address_description mm/kasan/report.c:377 [inline] print_report+0xc3/0x620 mm/kasan/report.c:488 kasan_report+0xd9/0x110 mm/kasan/report.c:601 br_multicast_port_group_expired+0x4c0/0x550 net/bridge/br_multicast.c:861 call_timer_fn+0x1a3/0x610 kernel/time/timer.c:1792 expire_timers kernel/time/timer.c:1843 [inline] __run_timers+0x74b/0xaf0 kernel/time/timer.c:2417 __run_timer_base kernel/time/timer.c:2428 [inline] __run_timer_base kernel/time/timer.c:2421 [inline] run_timer_base+0x111/0x190 kernel/time/timer.c:2437 Reported-by: syzbot+263426984509be19c9a0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=263426984509be19c9a0 Fixes: e12cec65b554 ("net: bridge: mcast: destroy all entries via gc") Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20240802080730.3206303-1-razor@blackwall.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-05Merge tag 'linux_kselftest-fixes-6.11-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fix from Shuah Khan: "A single fix to the conditional in ksft.py script which incorrectly flags a test suite failed when there are skipped tests in the mix. The logic is fixed to take skipped tests into account and report the test as passed" * tag 'linux_kselftest-fixes-6.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: ksft: Fix finished() helper exit code on skipped tests
2024-08-05Merge tag 'slab-fixes-for-6.11-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab fix from Vlastimil Babka: "Since v6.8 we've had a subtle breakage in SLUB with KFENCE enabled, that can cause a crash. It hasn't been found earlier due to quite specific conditions necessary (OOM during kmem_cache_alloc_bulk())" * tag 'slab-fixes-for-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm, slub: do not call do_slab_free for kfence object
2024-08-05net: usb: qmi_wwan: fix memory leak for not ip packetsDaniele Palmas
Free the unused skb when not ip packets arrive. Fixes: c6adf77953bc ("net: usb: qmi_wwan: add qmap mux protocol support") Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-05Merge branch 'virtio-net-rq-coalescing' into mainDavid S. Miller
Heng Qi says: ==================== virtio-net: unbreak vq resizing if vq coalescing is not supported Currently, if the driver does not negotiate the vq coalescing feature but supports vq resize, the vq resize action, which could have been successfully executed, is interrupted due to the failure in configuring the vq coalescing parameters. This issue needs to be fixed. Changelog ========= v3->v4: - Add a comment for patch[2/2]. v2->v3: - Break out the feature check and the fix into separate patches. v1->v2: - Rephrase the subject. - Put the feature check inside the virtnet_send_{r,t}x_ctrl_coal_vq_cmd. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-05virtio-net: unbreak vq resizing when coalescing is not negotiatedHeng Qi
Don't break the resize action if the vq coalescing feature named VIRTIO_NET_F_VQ_NOTF_COAL is not negotiated. Fixes: f61fe5f081cf ("virtio-net: fix the vq coalescing setting for vq resize") Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Eugenio Pé rez <eperezma@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-05virtio-net: check feature before configuring the vq coalescing commandHeng Qi
Virtio spec says: The driver MUST have negotiated the VIRTIO_NET_F_VQ_NOTF_COAL feature when issuing commands VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET and VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET. So we add the feature negotiation check to virtnet_send_{r,t}x_ctrl_coal_vq_cmd as a basis for the next bugfix patch. Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-05wifi: ath12k: use 128 bytes aligned iova in transmit path for WCN7850Baochen Qiang
In transmit path, it is likely that the iova is not aligned to PCIe TLP max payload size, which is 128 for WCN7850. Normally in such cases hardware is expected to split the packet into several parts in a manner such that they, other than the first one, have aligned iova. However due to hardware limitations, WCN7850 does not behave like that properly with some specific unaligned iova in transmit path. This easily results in target hang in a KPI transmit test: packet send/receive failure, WMI command send timeout etc. Also fatal error seen in PCIe level: ... Capabilities: ... ... DevSta: ... FatalErr+ ... ... ... Work around this by manually moving/reallocating payload buffer such that we can map it to a 128 bytes aligned iova. The moving requires sufficient head room or tail room in skb: for the former we can do ourselves a favor by asking some extra bytes when registering with mac80211, while for the latter we can do nothing. Moving/reallocating buffer consumes additional CPU cycles, but the good news is that an aligned iova increases PCIe efficiency. In my tests on some X86 platforms the KPI results are almost consistent. Since this is seen only with WCN7850, add a new hardware parameter to differentiate from others. Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3 Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com> Cc: <stable@vger.kernel.org> Tested-by: Mark Pearson <mpearson-lenovo@squebb.ca> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://patch.msgid.link/20240715023814.20242-1-quic_bqiang@quicinc.com
2024-08-04Linux 6.11-rc2Linus Torvalds
2024-08-04profiling: remove profile=sleep supportTetsuo Handa
The kernel sleep profile is no longer working due to a recursive locking bug introduced by commit 42a20f86dc19 ("sched: Add wrapper for get_wchan() to keep task blocked") Booting with the 'profile=sleep' kernel command line option added or executing # echo -n sleep > /sys/kernel/profiling after boot causes the system to lock up. Lockdep reports kthreadd/3 is trying to acquire lock: ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: get_wchan+0x32/0x70 but task is already holding lock: ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: try_to_wake_up+0x53/0x370 with the call trace being lock_acquire+0xc8/0x2f0 get_wchan+0x32/0x70 __update_stats_enqueue_sleeper+0x151/0x430 enqueue_entity+0x4b0/0x520 enqueue_task_fair+0x92/0x6b0 ttwu_do_activate+0x73/0x140 try_to_wake_up+0x213/0x370 swake_up_locked+0x20/0x50 complete+0x2f/0x40 kthread+0xfb/0x180 However, since nobody noticed this regression for more than two years, let's remove 'profile=sleep' support based on the assumption that nobody needs this functionality. Fixes: 42a20f86dc19 ("sched: Add wrapper for get_wchan() to keep task blocked") Cc: stable@vger.kernel.org # v5.16+ Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-08-04Merge tag 'x86-urgent-2024-08-04' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: - Prevent a deadlock on cpu_hotplug_lock in the aperf/mperf driver. A recent change in the ACPI code which consolidated code pathes moved the invocation of init_freq_invariance_cppc() to be moved to a CPU hotplug handler. The first invocation on AMD CPUs ends up enabling a static branch which dead locks because the static branch enable tries to acquire cpu_hotplug_lock but that lock is already held write by the hotplug machinery. Use static_branch_enable_cpuslocked() instead and take the hotplug lock read for the Intel code path which is invoked from the architecture code outside of the CPU hotplug operations. - Fix the number of reserved bits in the sev_config structure bit field so that the bitfield does not exceed 64 bit. - Add missing Zen5 model numbers - Fix the alignment assumptions of pti_clone_pgtable() and clone_entry_text() on 32-bit: The code assumes PMD aligned code sections, but on 32-bit the kernel entry text is not PMD aligned. So depending on the code size and location, which is configuration and compiler dependent, entry text can cross a PMD boundary. As the start is not PMD aligned adding PMD size to the start address is larger than the end address which results in partially mapped entry code for user space. That causes endless recursion on the first entry from userspace (usually #PF). Cure this by aligning the start address in the addition so it ends up at the next PMD start address. clone_entry_text() enforces PMD mapping, but on 32-bit the tail might eventually be PTE mapped, which causes a map fail because the PMD for the tail is not a large page mapping. Use PTI_LEVEL_KERNEL_IMAGE for the clone() invocation which resolves to PTE on 32-bit and PMD on 64-bit. - Zero the 8-byte case for get_user() on range check failure on 32-bit The recend consolidation of the 8-byte get_user() case broke the zeroing in the failure case again. Establish it by clearing ECX before the range check and not afterwards as that obvioulsy can't be reached when the range check fails * tag 'x86-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/uaccess: Zero the 8-byte get_range case on failure on 32-bit x86/mm: Fix pti_clone_entry_text() for i386 x86/mm: Fix pti_clone_pgtable() alignment assumption x86/setup: Parse the builtin command line before merging x86/CPU/AMD: Add models 0x60-0x6f to the Zen5 range x86/sev: Fix __reserved field in sev_config x86/aperfmperf: Fix deadlock on cpu_hotplug_lock
2024-08-04Merge tag 'timers-urgent-2024-08-04' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "Two fixes for the timer/clocksource code: - The recent fix to make the take over of the broadcast timer more reliable retrieves a per CPU pointer in preemptible context. This went unnoticed in testing as some compilers hoist the access into the non-preemotible section where the pointer is actually used, but obviously compilers can rightfully invoke it where the code put it. Move it into the non-preemptible section right to the actual usage side to cure it. - The clocksource watchdog is supposed to emit a warning when the retry count is greater than one and the number of retries reaches the limit. The condition is backwards and warns always when the count is greater than one. Fixup the condition to prevent spamming dmesg" * tag 'timers-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource: Fix brown-bag boolean thinko in cs_watchdog_read() tick/broadcast: Move per CPU pointer access into the atomic section
2024-08-04Merge tag 'sched-urgent-2024-08-04' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: - When stime is larger than rtime due to accounting imprecision, then utime = rtime - stime becomes negative. As this is unsigned math, the result becomes a huge positive number. Cure it by resetting stime to rtime in that case, so utime becomes 0. - Restore consistent state when sched_cpu_deactivate() fails. When offlining a CPU fails in sched_cpu_deactivate() after the SMT present counter has been decremented, then the function aborts but fails to increment the SMT present counter and leaves it imbalanced. Consecutive operations cause it to underflow. Add the missing fixup for the error path. For SMT accounting the runqueue needs to marked online again in the error exit path to restore consistent state. * tag 'sched-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/core: Fix unbalance set_rq_online/offline() in sched_cpu_deactivate() sched/core: Introduce sched_set_rq_on/offline() helper sched/smt: Fix unbalance sched_smt_present dec/inc sched/smt: Introduce sched_smt_present_inc/dec() helper sched/cputime: Fix mul_u64_u64_div_u64() precision for cputime
2024-08-04Merge tag 'perf-urgent-2024-08-04' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf fixes from Thomas Gleixner: - Move the smp_processor_id() invocation back into the non-preemtible region, so that the result is valid to use - Add the missing package C2 residency counters for Sierra Forest CPUs to make the newly added support actually useful * tag 'perf-urgent-2024-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: Fix smp_processor_id()-in-preemptible warnings perf/x86/intel/cstate: Add pkg C2 residency counter for Sierra Forest