summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-05-30net: mana: Fix perf regression: remove rx_cqes, tx_cqes countersHaiyang Zhang
The apc->eth_stats.rx_cqes is one per NIC (vport), and it's on the frequent and parallel code path of all queues. So, r/w into this single shared variable by many threads on different CPUs creates a lot caching and memory overhead, hence perf regression. And, it's not accurate due to the high volume concurrent r/w. For example, a workload is iperf with 128 threads, and with RPS enabled. We saw perf regression of 25% with the previous patch adding the counters. And this patch eliminates the regression. Since the error path of mana_poll_rx_cq() already has warnings, so keeping the counter and convert it to a per-queue variable is not necessary. So, just remove this counter from this high frequency code path. Also, remove the tx_cqes counter for the same reason. We have warnings & other counters for errors on that path, and don't need to count every normal cqe processing. Cc: stable@vger.kernel.org Fixes: bd7fc6e1957c ("net: mana: Add new MANA VF performance counters for easier troubleshooting") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/1685115537-31675-1-git-send-email-haiyangz@microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30Merge branch 'two-fixes-for-smcrv2'Paolo Abeni
Wen Gu says: ==================== Two fixes for SMCRv2 This patch set includes two bugfix for SMCRv2. ==================== Link: https://lore.kernel.org/r/1685101741-74826-1-git-send-email-guwen@linux.alibaba.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30net/smc: Don't use RMBs not mapped to new link in SMCRv2 ADD LINKWen Gu
We encountered a crash when using SMCRv2. It is caused by a logical error in smc_llc_fill_ext_v2(). BUG: kernel NULL pointer dereference, address: 0000000000000014 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 7 PID: 453 Comm: kworker/7:4 Kdump: loaded Tainted: G W E 6.4.0-rc3+ #44 Workqueue: events smc_llc_add_link_work [smc] RIP: 0010:smc_llc_fill_ext_v2+0x117/0x280 [smc] RSP: 0018:ffffacb5c064bd88 EFLAGS: 00010282 RAX: ffff9a6bc1c3c02c RBX: ffff9a6be3558000 RCX: 0000000000000000 RDX: 0000000000000002 RSI: 0000000000000002 RDI: 000000000000000a RBP: ffffacb5c064bdb8 R08: 0000000000000040 R09: 000000000000000c R10: ffff9a6bc0910300 R11: 0000000000000002 R12: 0000000000000000 R13: 0000000000000002 R14: ffff9a6bc1c3c02c R15: ffff9a6be3558250 FS: 0000000000000000(0000) GS:ffff9a6eefdc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000014 CR3: 000000010b078003 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> smc_llc_send_add_link+0x1ae/0x2f0 [smc] smc_llc_srv_add_link+0x2c9/0x5a0 [smc] ? cc_mkenc+0x40/0x60 smc_llc_add_link_work+0xb8/0x140 [smc] process_one_work+0x1e5/0x3f0 worker_thread+0x4d/0x2f0 ? __pfx_worker_thread+0x10/0x10 kthread+0xe5/0x120 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2c/0x50 </TASK> When an alernate RNIC is available in system, SMC will try to add a new link based on the RNIC for resilience. All the RMBs in use will be mapped to the new link. Then the RMBs' MRs corresponding to the new link will be filled into SMCRv2 LLC ADD LINK messages. However, smc_llc_fill_ext_v2() mistakenly accesses to unused RMBs which haven't been mapped to the new link and have no valid MRs, thus causing a crash. So this patch fixes the logic. Fixes: b4ba4652b3f8 ("net/smc: extend LLC layer for SMC-Rv2") Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30net/smc: Scan from current RMB list when no position specifiedWen Gu
When finding the first RMB of link group, it should start from the current RMB list whose index is 0. So fix it. Fixes: b4ba4652b3f8 ("net/smc: extend LLC layer for SMC-Rv2") Signed-off-by: Wen Gu <guwen@linux.alibaba.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-30rxrpc: Truncate UTS_RELEASE for rxrpc versionDavid Howells
UTS_RELEASE has a maximum length of 64 which can cause rxrpc_version to exceed the 65 byte message limit. Per the rx spec[1]: "If a server receives a packet with a type value of 13, and the client-initiated flag set, it should respond with a 65-byte payload containing a string that identifies the version of AFS software it is running." The current implementation causes a compile error when WERROR is turned on and/or UTS_RELEASE exceeds the length of 49 (making the version string more than 64 characters). Fix this by generating the string during module initialisation and limiting the UTS_RELEASE segment of the string does not exceed 49 chars. We need to make sure that the 64 bytes includes "linux-" at the front and " AF_RXRPC" at the back as this may be used in pattern matching. Fixes: 44ba06987c0b ("RxRPC: Handle VERSION Rx protocol packets") Reported-by: Kenny Ho <Kenny.Ho@amd.com> Link: https://lore.kernel.org/r/20230523223944.691076-1-Kenny.Ho@amd.com/ Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Kenny Ho <Kenny.Ho@amd.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: Andrew Lunn <andrew@lunn.ch> cc: David Laight <David.Laight@ACULAB.COM> cc: "David S. Miller" <davem@davemloft.net> cc: Eric Dumazet <edumazet@google.com> cc: Jakub Kicinski <kuba@kernel.org> cc: Paolo Abeni <pabeni@redhat.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org Link: https://web.mit.edu/kolya/afs/rx/rx-spec [1] Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> Link: https://lore.kernel.org/r/654974.1685100894@warthog.procyon.org.uk Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-29tcp: Return user_mss for TCP_MAXSEG in CLOSE/LISTEN state if user_mss setCambda Zhu
This patch replaces the tp->mss_cache check in getting TCP_MAXSEG with tp->rx_opt.user_mss check for CLOSE/LISTEN sock. Since tp->mss_cache is initialized with TCP_MSS_DEFAULT, checking if it's zero is probably a bug. With this change, getting TCP_MAXSEG before connecting will return default MSS normally, and return user_mss if user_mss is set. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Jack Yang <mingliang@linux.alibaba.com> Suggested-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/netdev/CANn89i+3kL9pYtkxkwxwNMzvC_w3LNUum_2=3u+UyLBmGmifHA@mail.gmail.com/#t Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com> Link: https://lore.kernel.org/netdev/14D45862-36EA-4076-974C-EA67513C92F6@linux.alibaba.com/ Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230527040317.68247-1-cambda@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-29tcp: deny tcp_disconnect() when threads are waitingEric Dumazet
Historically connect(AF_UNSPEC) has been abused by syzkaller and other fuzzers to trigger various bugs. A recent one triggers a divide-by-zero [1], and Paolo Abeni was able to diagnose the issue. tcp_recvmsg_locked() has tests about sk_state being not TCP_LISTEN and TCP REPAIR mode being not used. Then later if socket lock is released in sk_wait_data(), another thread can call connect(AF_UNSPEC), then make this socket a TCP listener. When recvmsg() is resumed, it can eventually call tcp_cleanup_rbuf() and attempt a divide by 0 in tcp_rcv_space_adjust() [1] This patch adds a new socket field, counting number of threads blocked in sk_wait_event() and inet_wait_for_connect(). If this counter is not zero, tcp_disconnect() returns an error. This patch adds code in blocking socket system calls, thus should not hurt performance of non blocking ones. Note that we probably could revert commit 499350a5a6e7 ("tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0") to restore original tcpi_rcv_mss meaning (was 0 if no payload was ever received on a socket) [1] divide error: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 13832 Comm: syz-executor.5 Not tainted 6.3.0-rc4-syzkaller-00224-g00c7b5f4ddc5 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023 RIP: 0010:tcp_rcv_space_adjust+0x36e/0x9d0 net/ipv4/tcp_input.c:740 Code: 00 00 00 00 fc ff df 4c 89 64 24 48 8b 44 24 04 44 89 f9 41 81 c7 80 03 00 00 c1 e1 04 44 29 f0 48 63 c9 48 01 e9 48 0f af c1 <49> f7 f6 48 8d 04 41 48 89 44 24 40 48 8b 44 24 30 48 c1 e8 03 48 RSP: 0018:ffffc900033af660 EFLAGS: 00010206 RAX: 4a66b76cbade2c48 RBX: ffff888076640cc0 RCX: 00000000c334e4ac RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000001 RBP: 00000000c324e86c R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880766417f8 R13: ffff888028fbb980 R14: 0000000000000000 R15: 0000000000010344 FS: 00007f5bffbfe700(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32f25000 CR3: 000000007ced0000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcp_recvmsg_locked+0x100e/0x22e0 net/ipv4/tcp.c:2616 tcp_recvmsg+0x117/0x620 net/ipv4/tcp.c:2681 inet6_recvmsg+0x114/0x640 net/ipv6/af_inet6.c:670 sock_recvmsg_nosec net/socket.c:1017 [inline] sock_recvmsg+0xe2/0x160 net/socket.c:1038 ____sys_recvmsg+0x210/0x5a0 net/socket.c:2720 ___sys_recvmsg+0xf2/0x180 net/socket.c:2762 do_recvmmsg+0x25e/0x6e0 net/socket.c:2856 __sys_recvmmsg net/socket.c:2935 [inline] __do_sys_recvmmsg net/socket.c:2958 [inline] __se_sys_recvmmsg net/socket.c:2951 [inline] __x64_sys_recvmmsg+0x20f/0x260 net/socket.c:2951 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f5c0108c0f9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f5bffbfe168 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 00007f5c011ac050 RCX: 00007f5c0108c0f9 RDX: 0000000000000001 RSI: 0000000020000bc0 RDI: 0000000000000003 RBP: 00007f5c010e7b39 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000122 R11: 0000000000000246 R12: 0000000000000000 R13: 00007f5c012cfb1f R14: 00007f5bffbfe300 R15: 0000000000022000 </TASK> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-by: Paolo Abeni <pabeni@redhat.com> Diagnosed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20230526163458.2880232-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-29af_packet: do not use READ_ONCE() in packet_bind()Eric Dumazet
A recent patch added READ_ONCE() in packet_bind() and packet_bind_spkt() This is better handled by reading pkt_sk(sk)->num later in packet_do_bind() while appropriate lock is held. READ_ONCE() in writers are often an evidence of something being wrong. Fixes: 822b5a1c17df ("af_packet: Fix data-races of pkt_sk(sk)->num.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230526154342.2533026-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-29netlink: specs: correct types of legacy arraysJakub Kicinski
ethtool has some attrs which dump multiple scalars into an attribute. The spec currently expects one attr per entry. Fixes: a353318ebf24 ("tools: ynl: populate most of the ethtool spec") Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20230526220653.65538-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-29net: usb: qmi_wwan: Set DTR quirk for BroadMobi BM818Sebastian Krzyszkowiak
BM818 is based on Qualcomm MDM9607 chipset. Fixes: 9a07406b00cd ("net: usb: qmi_wwan: Add the BroadMobi BM818 card") Cc: stable@vger.kernel.org Signed-off-by: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm> Acked-by: Bjørn Mork <bjorn@mork.no> Link: https://lore.kernel.org/r/20230526-bm818-dtr-v1-1-64bbfa6ba8af@puri.sm Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-30ata: libata-scsi: Use correct device no in ata_find_dev()Damien Le Moal
For devices not attached to a port multiplier and managed directly by libata, the device number passed to ata_find_dev() must always be lower than the maximum number of devices returned by ata_link_max_devices(). That is 1 for SATA devices or 2 for an IDE link with master+slave devices. This device number is the SCSI device ID which matches these constraints as the IDs are generated per port and so never exceed the maximum number of devices for the link being used. However, for libsas managed devices, SCSI device IDs are assigned per struct scsi_host, leading to device IDs for SATA devices that can be well in excess of libata per-link maximum number of devices. This results in ata_find_dev() to always return NULL for libsas managed devices except for the first device of the target scsi_host with ID (device number) equal to 0. This issue is visible by executing the hdparm utility, which fails. E.g.: hdparm -i /dev/sdX /dev/sdX: HDIO_GET_IDENTITY failed: No message of desired type Fix this by rewriting ata_find_dev() to ignore the device number for non-PMP attached devices with a link with at most 1 device, that is SATA devices. For these, the device number 0 is always used to return the correct pointer to the struct ata_device of the port link. This change excludes IDE master/slave setups (maximum number of devices per link is 2) and port-multiplier attached devices. Also, to be consistant with the fact that SCSI device IDs and channel numbers used as device numbers are both unsigned int, change the devno argument of ata_find_dev() to unsigned int. Reported-by: Xingui Yang <yangxingui@huawei.com> Fixes: 41bda9c98035 ("libata-link: update hotplug to handle PMP links") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Jason Yan <yanaijie@huawei.com>
2023-05-29RDMA/irdma: Fix Local Invalidate fencingMustafa Ismail
If the local invalidate fence is indicated in the WR, only the read fence is currently being set in WQE. Fix this to set both the read and local fence in the WQE. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Link: https://lore.kernel.org/r/20230522155654.1309-4-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-29RDMA/irdma: Prevent QP use after freeMustafa Ismail
There is a window where the poll cq may use a QP that has been freed. This can happen if a CQE is polled before irdma_clean_cqes() can clear the CQE's related to the QP and the destroy QP races to free the QP memory. then the QP structures are used in irdma_poll_cq. Fix this by moving the clearing of CQE's before the reference is removed and the QP is destroyed. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Link: https://lore.kernel.org/r/20230522155654.1309-3-shiraz.saleem@intel.com Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-29MAINTAINERS: Update maintainer of Amazon EFA driverMichael Margolin
Change EFA driver maintainer from Gal Pressman to myself. Keep Gal as a reviewer at his request. Link: https://lore.kernel.org/r/20230525094444.12570-1-mrgolin@amazon.com Signed-off-by: Michael Margolin <mrgolin@amazon.com> Acked-by: Gal Pressman <gal.pressman@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-29Merge tag 'trace-v6.4-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: "User events: - Use long instead of int for storing the enable set/clear bit, as it was found that big endian machines could end up using the wrong bits. - Split allocating mm and attaching it. This keeps the allocation separate from the registration and avoids various races. - Remove RCU locking around pin_user_pages_remote() as that can schedule. The RCU protection is no longer needed with the above split of mm allocation and attaching. - Rename the "link" fields of the various structs to something more meaningful. - Add comments around user_event_mm struct usage and locking requirements. Timerlat tracer: - Fix missed wakeup of timerlat thread caused by the timerlat interrupt triggering when tracing is off. The timer interrupt handler needs to always wake up the timerlat thread regardless if tracing is enabled or not, otherwise, it will never wake up. Histograms: - Fix regression of breaking the "stacktrace" modifier for variables. That modifier cannot be used for values, but can be used for variables that are passed from one histogram to the next. This was broken when adding the restriction to values as the variable logic used the same code. - Rename the special field "stacktrace" to "common_stacktrace". Special fields (that are not actually part of the event, but can act just like event fields, like 'comm' and 'timestamp') should be prefixed with 'common_' for consistency. To keep backward compatibility, 'stacktrace' can still be used (as with the special field 'cpu'), but can be overridden if the event has a field called 'stacktrace'. - Update the synthetic event selftests to use the new name (synthetic events are created by histograms) Tracing bootup selftests: - Reorganize the code to keep artifacts of the selftests not compiled in when selftests are not configured. - Add various cond_resched() around the selftest code, as the softlock watchdog was triggering much more often. It appears that the kernel runs slower now with full debugging enabled. - While debugging ftrace with ftrace (using an instance ring buffer instead of the top level one), I found that the selftests were disabling prints to the debug instance. This should not happen, as the selftests only disable printing to the main buffer as the selftests examine the main buffer to see if it has what it expects, and prints can make the tests fail. Make the selftests only disable printing to the toplevel buffer, and leave the instance buffers alone" * tag 'trace-v6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Have function_graph selftest call cond_resched() tracing: Only make selftest conditionals affect the global_trace tracing: Make tracing_selftest_running/delete nops when not used tracing: Have tracer selftests call cond_resched() before running tracing: Move setting of tracing_selftest_running out of register_tracer() tracing/selftests: Update synthetic event selftest to use common_stacktrace tracing: Rename stacktrace field to common_stacktrace tracing/histograms: Allow variables to have some modifiers tracing/user_events: Document user_event_mm one-shot list usage tracing/user_events: Rename link fields for clarity tracing/user_events: Remove RCU lock while pinning pages tracing/user_events: Split up mm alloc and attach tracing/timerlat: Always wakeup the timerlat thread tracing/user_events: Use long vs int for atomic bit ops
2023-05-29Merge tag 'v6.4-p3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "Fix an alignment crash in x86/aria" * tag 'v6.4-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: x86/aria - Use 16 byte alignment for GFNI constant vectors
2023-05-29Revert "module: error out early on concurrent load of the same module file"Linus Torvalds
This reverts commit 9828ed3f695a138f7add89fa2a186ababceb8006. Sadly, it does seem to cause failures to load modules. Johan Hovold reports: "This change breaks module loading during boot on the Lenovo Thinkpad X13s (aarch64). Specifically it results in indefinite probe deferral of the display and USB (ethernet) which makes it a pain to debug. Typing in the dark to acquire some logs reveals that other modules are missing as well" Since this was applied late as a "let's try this", I'm reverting it asap, and we can try to figure out what goes wrong later. The excessive parallel module loading problem is annoying, but not noticeable in normal situations, and this was only meant as an optimistic workaround for a user-space bug. One possible solution may be to do the optimistic exclusive open first, and then use a lock to serialize loading if that fails. Reported-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/lkml/ZHRpH-JXAxA6DnzR@hovoldconsulting.com/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-05-28tracing: Have function_graph selftest call cond_resched()Steven Rostedt (Google)
When all kernel debugging is enabled (lockdep, KSAN, etc), the function graph enabling and disabling can take several seconds to complete. The function_graph selftest enables and disables function graph tracing several times. With full debugging enabled, the soft lockup watchdog was triggering because the selftest was running without ever scheduling. Add cond_resched() throughout the test to make sure it does not trigger the soft lockup detector. Link: https://lkml.kernel.org/r/20230528051742.1325503-6-rostedt@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-05-28tracing: Only make selftest conditionals affect the global_traceSteven Rostedt (Google)
The tracing_selftest_running and tracing_selftest_disabled variables were to keep trace_printk() and other writes from affecting the tracing selftests, as the tracing selftests would examine the ring buffer to see if it contained what it expected or not. trace_printk() and friends could add to the ring buffer and cause the selftests to fail (and then disable the tracer that was being tested). To keep that from happening, these variables were added and would keep trace_printk() and friends from writing to the ring buffer while the tests were going on. But this was only the top level ring buffer (owned by the global_trace instance). There is no reason to prevent writing into ring buffers of other instances via the trace_array_printk() and friends. For the functions that could be used by other instances, check if the global_trace is the tracer instance that is being written to before deciding to not allow the write. Link: https://lkml.kernel.org/r/20230528051742.1325503-5-rostedt@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-05-28tracing: Make tracing_selftest_running/delete nops when not usedSteven Rostedt (Google)
There's no reason to test the condition variables tracing_selftest_running or tracing_selftest_delete when tracing selftests are not enabled. Make them define 0s when not the selftests are not configured in. Link: https://lkml.kernel.org/r/20230528051742.1325503-4-rostedt@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-05-28tracing: Have tracer selftests call cond_resched() before runningSteven Rostedt (Google)
As there are more and more internal selftests being added to the Linux kernel (KSAN, lockdep, etc) the selftests are taking longer to run when these are enabled. Add a cond_resched() to the calling of do_run_tracer_selftest() to force a schedule if NEED_RESCHED is set, otherwise the soft lockup watchdog may trigger on boot up. Link: https://lkml.kernel.org/r/20230528051742.1325503-3-rostedt@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-05-28tracing: Move setting of tracing_selftest_running out of register_tracer()Steven Rostedt (Google)
The variables tracing_selftest_running and tracing_selftest_disabled are only used for when CONFIG_FTRACE_STARTUP_TEST is enabled. Make them only visible within the selftest code. The setting of those variables are in the register_tracer() call, and set in a location where they do not need to be. Create a wrapper around run_tracer_selftest() called do_run_tracer_selftest() which sets those variables, and have register_tracer() call that instead. Having those variables only set within the CONFIG_FTRACE_STARTUP_TEST scope gets rid of them (and also the ability to remove testing against them) when the startup tests are not enabled (most cases). Link: https://lkml.kernel.org/r/20230528051742.1325503-2-rostedt@goodmis.org Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-05-28Merge tag 'phy-fixes-6.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy fixes from Vinod Koul: - init count imbalance fix in qcom-qmp-pcie and combo drivers - kernel doc header fix for qcom-snps driver - mediatek floating point comparison fix - amlogic fix register value * tag 'phy-fixes-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: qcom-snps: correct struct qcom_snps_hsphy kerneldoc phy: amlogic: phy-meson-g12a-mipi-dphy-analog: fix CNTL2_DIF_TX_CTL0 value phy: mediatek: rework the floating point comparisons to fixed point phy: qcom-qmp-pcie-msm8996: fix init-count imbalance phy: qcom-qmp-combo: fix init-count imbalance
2023-05-28Merge tag 'dmaengine-fix-6.4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: "Driver fixes for the at-hdmac, pl330, TI and IDXD drivers: - AT HDMAC driver fixes for Flow Controller bitfield, peripheral ID handling and potential NULL dereference check - PL330 function rename to avoid conflicts - build warning fix for pm function in TI driver - IDXD driver fix for passing freed memory" * tag 'dmaengine-fix-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine: at_hdmac: Extend the Flow Controller bitfield to three bits dmaengine: at_hdmac: Repair bitfield macros for peripheral ID handling dmaengine: pl330: rename _start to prevent build error dmaengine: at_xdmac: fix potential Oops in at_xdmac_prep_interleaved() dmaengine: ti: k3-udma: annotate pm function with __maybe_unused dmaengine: idxd: Fix passing freed memory in idxd_cdev_open()
2023-05-28ext4: add EA_INODE checking to ext4_iget()Theodore Ts'o
Add a new flag, EXT4_IGET_EA_INODE which indicates whether the inode is expected to have the EA_INODE flag or not. If the flag is not set/clear as expected, then fail the iget() operation and mark the file system as corrupted. This commit also makes the ext4_iget() always perform the is_bad_inode() check even when the inode is already inode cache. This allows us to remove the is_bad_inode() check from the callers of ext4_iget() in the ea_inode code. Reported-by: syzbot+cbb68193bdb95af4340a@syzkaller.appspotmail.com Reported-by: syzbot+62120febbd1ee3c3c860@syzkaller.appspotmail.com Reported-by: syzbot+edce54daffee36421b4c@syzkaller.appspotmail.com Cc: stable@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20230524034951.779531-2-tytso@mit.edu Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-05-28Linux 6.4-rc4v6.4-rc4Linus Torvalds
2023-05-28Merge tag 'x86-urgent-2023-05-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpu fix from Thomas Gleixner: "A single fix for x86: - Prevent a bogus setting for the number of HT siblings, which is caused by the CPUID evaluation trainwreck of X86. That recomputes the value for each CPU, so the last CPU "wins". That can cause completely bogus sibling values" * tag 'x86-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/topology: Fix erroneous smp_num_siblings on Intel Hybrid platforms
2023-05-28Merge tag 'perf-urgent-2023-05-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A small set of perf fixes: - Make the MSR-readout based CHA discovery work around broken discovery tables in some SPR firmwares. - Prevent saving PEBS configuration which has software bits set that cause a crash when restored into the relevant MSR" * tag 'perf-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/uncore: Correct the number of CHAs on SPR perf/x86/intel: Save/restore cpuc->active_pebs_data_cfg when using guest PEBS
2023-05-28Merge tag 'objtool-urgent-2023-05-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull unwinder fixes from Thomas Gleixner: "A set of unwinder and tooling fixes: - Ensure that the stack pointer on x86 is aligned again so that the unwinder does not read past the end of the stack - Discard .note.gnu.property section which has a pointlessly different alignment than the other note sections. That confuses tooling of all sorts including readelf, libbpf and pahole" * tag 'objtool-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/show_trace_log_lvl: Ensure stack pointer is aligned, again vmlinux.lds.h: Discard .note.gnu.property section
2023-05-28Merge tag 'core-debugobjects-2023-05-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull debugobjects fixes from Thomas Gleixner: "Two fixes for debugobjects: - Prevent the allocation path from waking up kswapd. That's a long standing issue due to the GFP_ATOMIC allocation flag. As debug objects can be invoked from pretty much any context waking kswapd can end up in arbitrary lock chains versus the waitqueue lock - Correct the explicit lockdep wait-type violation in debug_object_fill_pool()" * tag 'core-debugobjects-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: debugobjects: Don't wake up kswapd from fill_pool() debugobjects,locking: Annotate debug_object_fill_pool() wait type violation
2023-05-28Merge tag 'irq-urgent-2023-05-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A set of fixes for interrupt chip drivers: - Prevent loss of state in the MIPS GIC interrupt controller - Disable pseudo NMIs on Mediatek based Chromebooks as they have firmware issues which cause instantenous chrashes and freezes wen pseudo NMIs are used - Fix the error handling path in the MBIGEN driver and a defined but not used warning in the meson-gpio interrupt chip driver" * tag 'irq-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/mbigen: Unify the error handling in mbigen_of_create_domain() irqchip/meson-gpio: Mark OF related data as maybe unused irqchip/mips-gic: Use raw spinlock for gic_lock irqchip/mips-gic: Don't touch vl_map if a local interrupt is not routable irqchip/gic-v3: Disable pseudo NMIs on Mediatek devices w/ firmware issues dt-bindings: interrupt-controller: arm,gic-v3: Add quirk for Mediatek SoCs w/ broken FW
2023-05-28Merge tag 'mips-fixes_6.4_1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Thomas Bogendoerfer: - fixes to get alchemy platform back in shape - fix for initrd detection * tag 'mips-fixes_6.4_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: mips: Move initrd_start check after initrd address sanitisation. MIPS: Alchemy: fix dbdma2 MIPS: Restore Au1300 support MIPS: unhide PATA_PLATFORM
2023-05-27Merge tag 'powerpc-6.4-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: - Reinstate ARCH_FORCE_MAX_ORDER ranges to fix various breakage * tag 'powerpc-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Reinstate ARCH_FORCE_MAX_ORDER ranges
2023-05-27Merge tag 'for-linus-6.4-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - a double free fix in the Xen pvcalls backend driver - a fix for a regression causing the MSI related sysfs entries to not being created in Xen PV guests - a fix in the Xen blkfront driver for handling insane input data better * tag 'for-linus-6.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/pci/xen: populate MSI sysfs entries xen/pvcalls-back: fix double frees with pvcalls_new_active_socket() xen/blkfront: Only check REQ_FUA for writes
2023-05-27Merge tag 'char-misc-6.4-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc fixes from Greg KH: "Here are some small driver fixes for 6.4-rc4. They are just two different types: - binder fixes and reverts for reported problems and regressions in the binder "driver". - coresight driver fixes for reported problems. All of these have been in linux-next for over a week with no reported problems" * tag 'char-misc-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: binder: fix UAF of alloc->vma in race with munmap() binder: add lockless binder_alloc_(set|get)_vma() Revert "android: binder: stop saving a pointer to the VMA" Revert "binder_alloc: add missing mmap_lock calls when using the VMA" binder: fix UAF caused by faulty buffer cleanup coresight: perf: Release Coresight path when alloc trace id failed coresight: Fix signedness bug in tmc_etr_buf_insert_barrier_packet()
2023-05-27cifs: address unused variable warningSteve French
Fix trivial unused variable warning (when SMB1 support disabled) "ioctl.c:324:17: warning: variable 'caps' set but not used [-Wunused-but-set-variable]" Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202305250056.oZhsJmdD-lkp@intel.com/ Signed-off-by: Steve French <stfrench@microsoft.com>
2023-05-27wifi: mt76: mt7615: fix possible race in mt7615_mac_sta_pollLorenzo Bianconi
Grab sta_poll_lock spinlock in mt7615_mac_sta_poll routine in order to avoid possible races with mt7615_mac_add_txs() or mt7615_mac_fill_rx() removing msta pointer from sta_poll_list. Fixes: a621372a04ac ("mt76: mt7615: rework mt7615_mac_sta_poll for usb code") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/48b23404b759de4f1db2ef85975c72a4aeb1097c.1684938695.git.lorenzo@kernel.org
2023-05-26smb: delete an unnecessary statementDan Carpenter
We don't need to set the list iterators to NULL before a list_for_each_entry() loop because they are assigned inside the macro. Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-05-26ksmbd: call putname after using the last componentNamjae Jeon
last component point filename struct. Currently putname is called after vfs_path_parent_lookup(). And then last component is used for lookup_one_qstr_excl(). name in last component is freed by previous calling putname(). And It cause file lookup failure when testing generic/464 test of xfstest. Fixes: 74d7970febf7 ("ksmbd: fix racy issue from using ->d_parent and ->d_name") Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-05-26ksmbd: fix incorrect AllocationSize set in smb2_get_infoNamjae Jeon
If filesystem support sparse file, ksmbd should return allocated size using ->i_blocks instead of stat->size. This fix generic/694 xfstests. Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-05-26ksmbd: fix UAF issue from opinfo->connNamjae Jeon
If opinfo->conn is another connection and while ksmbd send oplock break request to cient on current connection, The connection for opinfo->conn can be disconnect and conn could be freed. When sending oplock break request, this ksmbd_conn can be used and cause user-after-free issue. When getting opinfo from the list, ksmbd check connection is being released. If it is not released, Increase ->r_count to wait that connection is freed. Cc: stable@vger.kernel.org Reported-by: Per Forlin <per.forlin@axis.com> Tested-by: Per Forlin <per.forlin@axis.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-05-26ksmbd: fix multiple out-of-bounds read during context decodingKuan-Ting Chen
Check the remaining data length before accessing the context structure to ensure that the entire structure is contained within the packet. Additionally, since the context data length `ctxt_len` has already been checked against the total packet length `len_of_ctxts`, update the comparison to use `ctxt_len`. Cc: stable@vger.kernel.org Signed-off-by: Kuan-Ting Chen <h3xrabbit@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-05-26ksmbd: fix slab-out-of-bounds read in smb2_handle_negotiateKuan-Ting Chen
Check request_buf length first to avoid out-of-bounds read by req->DialectCount. [ 3350.990282] BUG: KASAN: slab-out-of-bounds in smb2_handle_negotiate+0x35d7/0x3e60 [ 3350.990282] Read of size 2 at addr ffff88810ad61346 by task kworker/5:0/276 [ 3351.000406] Workqueue: ksmbd-io handle_ksmbd_work [ 3351.003499] Call Trace: [ 3351.006473] <TASK> [ 3351.006473] dump_stack_lvl+0x8d/0xe0 [ 3351.006473] print_report+0xcc/0x620 [ 3351.006473] kasan_report+0x92/0xc0 [ 3351.006473] smb2_handle_negotiate+0x35d7/0x3e60 [ 3351.014760] ksmbd_smb_negotiate_common+0x7a7/0xf00 [ 3351.014760] handle_ksmbd_work+0x3f7/0x12d0 [ 3351.014760] process_one_work+0xa85/0x1780 Cc: stable@vger.kernel.org Signed-off-by: Kuan-Ting Chen <h3xrabbit@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-05-26ksmbd: fix credit count leakageNamjae Jeon
This patch fix the failure from smb2.credits.single_req_credits_granted test. When client send 8192 credit request, ksmbd return 8191 credit granted. ksmbd should give maximum possible credits that must be granted within the range of not exceeding the max credit to client. Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-05-26ksmbd: fix uninitialized pointer read in smb2_create_link()Namjae Jeon
There is a case that file_present is true and path is uninitialized. This patch change file_present is set to false by default and set to true when patch is initialized. Fixes: 74d7970febf7 ("ksmbd: fix racy issue from using ->d_parent and ->d_name") Reported-by: Coverity Scan <scan-admin@coverity.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-05-26ksmbd: fix uninitialized pointer read in ksmbd_vfs_rename()Namjae Jeon
Uninitialized rd.delegated_inode can be used in vfs_rename(). Fix this by setting rd.delegated_inode to NULL to avoid the uninitialized read. Fixes: 74d7970febf7 ("ksmbd: fix racy issue from using ->d_parent and ->d_name") Reported-by: Coverity Scan <scan-admin@coverity.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-05-26Merge tag 'cxl-fixes-6.4-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull compute express link fixes from Dan Williams: "The 'media ready' series prevents the driver from acting on bad capacity information, and it moves some checks earlier in the init sequence which impacts topics in the queue for 6.5. Additional hotplug testing uncovered a missing enable for memory decode. A debug crash fix is also included. Summary: - Stop trusting capacity data before the "media ready" indication - Add missing HDM decoder capability enable for the cold-plug case - Fix a debug message induced crash" * tag 'cxl-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl: Explicitly initialize resources when media is not ready cxl/port: Fix NULL pointer access in devm_cxl_add_port() cxl: Move cxl_await_media_ready() to before capacity info retrieval cxl: Wait Memory_Info_Valid before access memory related info cxl/port: Enable the HDM decoder capability for switch ports
2023-05-26Merge tag 'arm-fixes-6.4-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "There have not been a lot of fixes for for the soc tree in 6.4, but these have been sitting here for too long. For the devicetree side, there is one minor warning fix for vexpress, the rest all all for the the NXP i.MX platforms: SoC specific bugfixes for the iMX8 clocks and its USB-3.0 gadget device, as well as board specific fixes for regulators and the phy on some of the i.MX boards. The microchip risc-v and arm32 maintainers now also add a shared maintainer file entry for the arm64 parts. The remaining fixes are all for firmware drivers, addressing mistakes in the optee, scmi and ff-a firmware driver implementation, mostly in the error handling code, incorrect use of the alloc_workqueue() interface in SCMI, and compatibility with corner cases of the firmware implementation" * tag 'arm-fixes-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: MAINTAINERS: update arm64 Microchip entries arm64: dts: imx8: fix USB 3.0 Gadget Failure in QM & QXPB0 at super speed dt-binding: cdns,usb3: Fix cdns,on-chip-buff-size type arm64: dts: colibri-imx8x: delete adc1 and dsp arm64: dts: colibri-imx8x: fix iris pinctrl configuration arm64: dts: colibri-imx8x: move pinctrl property from SoM to eval board arm64: dts: colibri-imx8x: fix eval board pin configuration arm64: dts: imx8mp: Fix video clock parents ARM: dts: imx6qdl-mba6: Add missing pvcie-supply regulator ARM: dts: imx6ull-dhcor: Set and limit the mode for PMIC buck 1, 2 and 3 arm64: dts: imx8mn-var-som: fix PHY detection bug by adding deassert delay arm64: dts: imx8mn: Fix video clock parents firmware: arm_ffa: Set reserved/MBZ fields to zero in the memory descriptors firmware: arm_ffa: Fix FFA device names for logical partitions firmware: arm_ffa: Fix usage of partition info get count flag firmware: arm_ffa: Check if ffa_driver remove is present before executing arm64: dts: arm: add missing cache properties ARM: dts: vexpress: add missing cache properties firmware: arm_scmi: Fix incorrect alloc_workqueue() invocation optee: fix uninited async notif value
2023-05-26Merge tag 'pci-v6.4-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI fix from Bjorn Helgaas: - Quirk Ice Lake Root Ports to work around DPC log size issue (Mika Westerberg) * tag 'pci-v6.4-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI/DPC: Quirk PIO log size for Intel Ice Lake Root Ports
2023-05-26Merge tag 'vfio-v6.4-rc4' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO fix from Alex Williamson: - Test for and return error for invalid pfns through the pin pages interface (Yan Zhao) * tag 'vfio-v6.4-rc4' of https://github.com/awilliam/linux-vfio: vfio/type1: check pfn valid before converting to struct page