summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-06-17Merge tag 'tty-6.4-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull serial driver fixes from Greg KH: "Here are two small serial driver fixes for 6.4-rc7 that resolve some reported problems: - lantiq serial driver irq fix - fsl_lpuart serial driver watermark fix Both of these have been in linux-next this week with no reported issues" * tag 'tty-6.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: tty: serial: fsl_lpuart: reduce RX watermark to 0 on LS1028A serial: lantiq: add missing interrupt ack
2023-06-17SUNRPC: Address RCU warning in net/sunrpc/svc.cChuck Lever
$ make C=1 W=1 net/sunrpc/svc.o make[1]: Entering directory 'linux/obj/manet.1015granger.net' GEN Makefile CALL linux/server-development/scripts/checksyscalls.sh DESCEND objtool INSTALL libsubcmd_headers DESCEND bpf/resolve_btfids INSTALL libsubcmd_headers CC [M] net/sunrpc/svc.o CHECK linux/server-development/net/sunrpc/svc.c linux/server-development/net/sunrpc/svc.c:1225:9: warning: incorrect type in argument 1 (different address spaces) linux/server-development/net/sunrpc/svc.c:1225:9: expected struct spinlock [usertype] *lock linux/server-development/net/sunrpc/svc.c:1225:9: got struct spinlock [noderef] __rcu * linux/server-development/net/sunrpc/svc.c:1227:40: warning: incorrect type in argument 1 (different address spaces) linux/server-development/net/sunrpc/svc.c:1227:40: expected struct spinlock [usertype] *lock linux/server-development/net/sunrpc/svc.c:1227:40: got struct spinlock [noderef] __rcu * make[1]: Leaving directory 'linux/obj/manet.1015granger.net' Warning introduced by commit 913292c97d75 ("sched.h: Annotate sighand_struct with __rcu"). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17SUNRPC: Use sysfs_emit in place of strlcpy/sprintfAzeem Shaikh
Part of an effort to remove strlcpy() tree-wide [1]. Direct replacement is safe here since the getter in kernel_params_ops handles -errno return [2]. [1] https://github.com/KSPP/linux/issues/89 [2] https://elixir.bootlin.com/linux/v6.4-rc6/source/include/linux/moduleparam.h#L52 Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17SUNRPC: Remove transport class dprintk call sitesChuck Lever
Remove a couple of dprintk call sites that are of little value. Reviewed-by: Jeff Layton <jlayton@kernel.org> Acked-by: Tom Talpey <tom@talpey.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17SUNRPC: Fix comments for transport class registrationChuck Lever
The preceding block comment before svc_register_xprt_class() is not related to that function. While we're here, add proper documenting comments for these two publicly-visible functions. Reviewed-by: Jeff Layton <jlayton@kernel.org> Acked-by: Tom Talpey <tom@talpey.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17svcrdma: Remove an unused argument from __svc_rdma_put_rw_ctxt()Chuck Lever
Clean up. Reviewed-by: Jeff Layton <jlayton@kernel.org> Acked-by: Tom Talpey <tom@talpey.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17svcrdma: trace cc_release callsChuck Lever
This event brackets the svcrdma_post_* trace points. If this trace event is enabled but does not appear as expected, that indicates a chunk_ctxt leak. Reviewed-by: Jeff Layton <jlayton@kernel.org> Acked-by: Tom Talpey <tom@talpey.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17svcrdma: Convert "might sleep" comment into a code annotationChuck Lever
Try to catch incorrect calling contexts mechanically rather than by code review. Reviewed-by: Jeff Layton <jlayton@kernel.org> Acked-by: Tom Talpey <tom@talpey.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17NFSD: Add an nfsd4_encode_nfstime4() helperChuck Lever
Clean up: de-duplicate some common code. Reviewed-by: Jeff Layton <jlayton@kernel.org> Acked-by: Tom Talpey <tom@talpey.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17SUNRPC: Move initialization of rq_stimeChuck Lever
Micro-optimization: Call ktime_get() only when ->xpo_recvfrom() has given us a full RPC message to process. rq_stime isn't used otherwise, so this avoids pointless work. Reviewed-by: Jeff Layton <jlayton@kernel.org> Acked-by: Tom Talpey <tom@talpey.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17SUNRPC: Optimize page release in svc_rdma_sendto()Chuck Lever
Now that we have bulk page allocation and release APIs, it's more efficient to use those than it is for nfsd threads to wait for send completions. Previous patches have eliminated the calls to wait_for_completion() and complete(), in order to avoid scheduler overhead. Now release pages-under-I/O in the send completion handler using the efficient bulk release API. I've measured a 7% reduction in cumulative CPU utilization in svc_rdma_sendto(), svc_rdma_wc_send(), and svc_xprt_release(). In particular, using release_pages() instead of complete() cuts the time per svc_rdma_wc_send() call by two-thirds. This helps improve scalability because svc_rdma_wc_send() is single-threaded per connection. Reviewed-by: Tom Talpey <tom@talpey.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17svcrdma: Prevent page release when nothing was receivedChuck Lever
I noticed that svc_rqst_release_pages() was still unnecessarily releasing a page when svc_rdma_recvfrom() returns zero. Fixes: a53d5cb0646a ("svcrdma: Avoid releasing a page in svc_xprt_release()") Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-17tcp: enforce receive buffer memory limits by allowing the tcp window to shrinkmfreemon@cloudflare.com
Under certain circumstances, the tcp receive buffer memory limit set by autotuning (sk_rcvbuf) is increased due to incoming data packets as a result of the window not closing when it should be. This can result in the receive buffer growing all the way up to tcp_rmem[2], even for tcp sessions with a low BDP. To reproduce: Connect a TCP session with the receiver doing nothing and the sender sending small packets (an infinite loop of socket send() with 4 bytes of payload with a sleep of 1 ms in between each send()). This will cause the tcp receive buffer to grow all the way up to tcp_rmem[2]. As a result, a host can have individual tcp sessions with receive buffers of size tcp_rmem[2], and the host itself can reach tcp_mem limits, causing the host to go into tcp memory pressure mode. The fundamental issue is the relationship between the granularity of the window scaling factor and the number of byte ACKed back to the sender. This problem has previously been identified in RFC 7323, appendix F [1]. The Linux kernel currently adheres to never shrinking the window. In addition to the overallocation of memory mentioned above, the current behavior is functionally incorrect, because once tcp_rmem[2] is reached when no remediations remain (i.e. tcp collapse fails to free up any more memory and there are no packets to prune from the out-of-order queue), the receiver will drop in-window packets resulting in retransmissions and an eventual timeout of the tcp session. A receive buffer full condition should instead result in a zero window and an indefinite wait. In practice, this problem is largely hidden for most flows. It is not applicable to mice flows. Elephant flows can send data fast enough to "overrun" the sk_rcvbuf limit (in a single ACK), triggering a zero window. But this problem does show up for other types of flows. Examples are websockets and other type of flows that send small amounts of data spaced apart slightly in time. In these cases, we directly encounter the problem described in [1]. RFC 7323, section 2.4 [2], says there are instances when a retracted window can be offered, and that TCP implementations MUST ensure that they handle a shrinking window, as specified in RFC 1122, section 4.2.2.16 [3]. All prior RFCs on the topic of tcp window management have made clear that sender must accept a shrunk window from the receiver, including RFC 793 [4] and RFC 1323 [5]. This patch implements the functionality to shrink the tcp window when necessary to keep the right edge within the memory limit by autotuning (sk_rcvbuf). This new functionality is enabled with the new sysctl: net.ipv4.tcp_shrink_window Additional information can be found at: https://blog.cloudflare.com/unbounded-memory-usage-by-tcp-for-receive-buffers-and-how-we-fixed-it/ [1] https://www.rfc-editor.org/rfc/rfc7323#appendix-F [2] https://www.rfc-editor.org/rfc/rfc7323#section-2.4 [3] https://www.rfc-editor.org/rfc/rfc1122#page-91 [4] https://www.rfc-editor.org/rfc/rfc793 [5] https://www.rfc-editor.org/rfc/rfc1323 Signed-off-by: Mike Freemon <mfreemon@cloudflare.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-17devlink: report devlink_port_type_warn source devicePetr Oros
devlink_port_type_warn is scheduled for port devlink and warning when the port type is not set. But from this warning it is not easy found out which device (driver) has no devlink port set. [ 3709.975552] Type was not set for devlink port. [ 3709.975579] WARNING: CPU: 1 PID: 13092 at net/devlink/leftover.c:6775 devlink_port_type_warn+0x11/0x20 [ 3709.993967] Modules linked in: openvswitch nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nfnetlink bluetooth rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs vhost_net vhost vhost_iotlb tap tun bridge stp llc qrtr intel_rapl_msr intel_rapl_common i10nm_edac nfit libnvdimm x86_pkg_temp_thermal mlx5_ib intel_powerclamp coretemp dell_wmi ledtrig_audio sparse_keymap ipmi_ssif kvm_intel ib_uverbs rfkill ib_core video kvm iTCO_wdt acpi_ipmi intel_vsec irqbypass ipmi_si iTCO_vendor_support dcdbas ipmi_devintf mei_me ipmi_msghandler rapl mei intel_cstate isst_if_mmio isst_if_mbox_pci dell_smbios intel_uncore isst_if_common i2c_i801 dell_wmi_descriptor wmi_bmof i2c_smbus intel_pch_thermal pcspkr acpi_power_meter xfs libcrc32c sd_mod sg nvme_tcp mgag200 i2c_algo_bit nvme_fabrics drm_shmem_helper drm_kms_helper nvme syscopyarea ahci sysfillrect sysimgblt nvme_core fb_sys_fops crct10dif_pclmul libahci mlx5_core sfc crc32_pclmul nvme_common drm [ 3709.994030] crc32c_intel mtd t10_pi mlxfw libata tg3 mdio megaraid_sas psample ghash_clmulni_intel pci_hyperv_intf wmi dm_multipath sunrpc dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse [ 3710.108431] CPU: 1 PID: 13092 Comm: kworker/1:1 Kdump: loaded Not tainted 5.14.0-319.el9.x86_64 #1 [ 3710.108435] Hardware name: Dell Inc. PowerEdge R750/0PJ80M, BIOS 1.8.2 09/14/2022 [ 3710.108437] Workqueue: events devlink_port_type_warn [ 3710.108440] RIP: 0010:devlink_port_type_warn+0x11/0x20 [ 3710.108443] Code: 84 76 fe ff ff 48 c7 03 20 0e 1a ad 31 c0 e9 96 fd ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 48 c7 c7 18 24 4e ad e8 ef 71 62 ff <0f> 0b c3 cc cc cc cc 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f6 87 [ 3710.108445] RSP: 0018:ff3b6d2e8b3c7e90 EFLAGS: 00010282 [ 3710.108447] RAX: 0000000000000000 RBX: ff366d6580127080 RCX: 0000000000000027 [ 3710.108448] RDX: 0000000000000027 RSI: 00000000ffff86de RDI: ff366d753f41f8c8 [ 3710.108449] RBP: ff366d658ff5a0c0 R08: ff366d753f41f8c0 R09: ff3b6d2e8b3c7e18 [ 3710.108450] R10: 0000000000000001 R11: 0000000000000023 R12: ff366d753f430600 [ 3710.108451] R13: ff366d753f436900 R14: 0000000000000000 R15: ff366d753f436905 [ 3710.108452] FS: 0000000000000000(0000) GS:ff366d753f400000(0000) knlGS:0000000000000000 [ 3710.108453] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3710.108454] CR2: 00007f1c57bc74e0 CR3: 000000111d26a001 CR4: 0000000000773ee0 [ 3710.108456] PKRU: 55555554 [ 3710.108457] Call Trace: [ 3710.108458] <TASK> [ 3710.108459] process_one_work+0x1e2/0x3b0 [ 3710.108466] ? rescuer_thread+0x390/0x390 [ 3710.108468] worker_thread+0x50/0x3a0 [ 3710.108471] ? rescuer_thread+0x390/0x390 [ 3710.108473] kthread+0xdd/0x100 [ 3710.108477] ? kthread_complete_and_exit+0x20/0x20 [ 3710.108479] ret_from_fork+0x1f/0x30 [ 3710.108485] </TASK> [ 3710.108486] ---[ end trace 1b4b23cd0c65d6a0 ]--- After patch: [ 402.473064] ice 0000:41:00.0: Type was not set for devlink port. [ 402.473064] ice 0000:41:00.1: Type was not set for devlink port. Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20230615095447.8259-1-poros@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-17net: mctp: remove redundant RTN_UNICAST checkLin Ma
Current mctp_newroute() contains two exactly same check against rtm->rtm_type static int mctp_newroute(...) { ... if (rtm->rtm_type != RTN_UNICAST) { // (1) NL_SET_ERR_MSG(extack, "rtm_type must be RTN_UNICAST"); return -EINVAL; } ... if (rtm->rtm_type != RTN_UNICAST) // (2) return -EINVAL; ... } This commits removes the (2) check as it is redundant. Signed-off-by: Lin Ma <linma@zju.edu.cn> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Acked-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://lore.kernel.org/r/20230615152240.1749428-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-17netlink: specs: fixup openvswitch specs for code generationDonald Hunter
Refine the ovs_* specs to align exactly with the ovs netlink UAPI definitions to enable code generation. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20230615151405.77649-1-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-17net: sched: Remove unused qdisc_l2t()YueHaibing
This is unused since switch to psched_l2t_ns(). Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230615124810.34020-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-17kcm: Fix unnecessary psock unreservation.David Howells
kcm_write_msgs() calls unreserve_psock() to release its hold on the underlying TCP socket if it has run out of things to transmit, but if we have nothing in the write queue on entry (e.g. because someone did a zero-length sendmsg), we don't actually go into the transmission loop and as a consequence don't call reserve_psock(). Fix this by skipping the call to unreserve_psock() if we didn't reserve a psock. Fixes: c31a25e1db48 ("kcm: Send multiple frags in one sendmsg()") Reported-by: syzbot+dd1339599f1840e4cc65@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/000000000000a61ffe05fe0c3d08@google.com/ Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: syzbot+dd1339599f1840e4cc65@syzkaller.appspotmail.com cc: Tom Herbert <tom@herbertland.com> cc: Tom Herbert <tom@quantonium.net> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Link: https://lore.kernel.org/r/20787.1686828722@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-17sfc: use budget for TX completionsÍñigo Huguet
When running workloads heavy unbalanced towards TX (high TX, low RX traffic), sfc driver can retain the CPU during too long times. Although in many cases this is not enough to be visible, it can affect performance and system responsiveness. A way to reproduce it is to use a debug kernel and run some parallel netperf TX tests. In some systems, this will lead to this message being logged: kernel:watchdog: BUG: soft lockup - CPU#12 stuck for 22s! The reason is that sfc driver doesn't account any NAPI budget for the TX completion events work. With high-TX/low-RX traffic, this makes that the CPU is held for long time for NAPI poll. Documentations says "drivers can process completions for any number of Tx packets but should only process up to budget number of Rx packets". However, many drivers do limit the amount of TX completions that they process in a single NAPI poll. In the same way, this patch adds a limit for the TX work in sfc. With the patch applied, the watchdog warning never appears. Tested with netperf in different combinations: single process / parallel processes, TCP / UDP and different sizes of UDP messages. Repeated the tests before and after the patch, without any noticeable difference in network or CPU performance. Test hardware: Intel(R) Xeon(R) CPU E5-1620 v4 @ 3.50GHz (4 cores, 2 threads/core) Solarflare Communications XtremeScale X2522-25G Network Adapter Fixes: 5227ecccea2d ("sfc: remove tx and MCDI handling from NAPI budget consideration") Fixes: d19a53721863 ("sfc_ef100: TX path for EF100 NICs") Reported-by: Fei Liu <feliu@redhat.com> Signed-off-by: Íñigo Huguet <ihuguet@redhat.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://lore.kernel.org/r/20230615084929.10506-1-ihuguet@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-17irqchip/jcore-aic: Fix missing allocation of IRQ descriptorsJohn Paul Adrian Glaubitz
The initialization function for the J-Core AIC aic_irq_of_init() is currently missing the call to irq_alloc_descs() which allocates and initializes all the IRQ descriptors. Add missing function call and return the error code from irq_alloc_descs() in case the allocation fails. Fixes: 981b58f66cfc ("irqchip/jcore-aic: Add J-Core AIC driver") Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Tested-by: Rob Landley <rob@landley.net> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230510163343.43090-1-glaubitz@physik.fu-berlin.de
2023-06-17irqchip/stm32-exti: Fix warning on initialized field overwrittenAntonio Borneo
While compiling with W=1, both gcc and clang complain about a tricky way to initialize an array by filling it with a non-zero value and then overrride some of the array elements. In this case the override is intentional, so just disable the specific warning for only this part of the code. Note: the flag "-Woverride-init" is recognized by both compilers, but the warning msg from clang reports "-Winitializer-overrides". The doc of clang clarifies that the two flags are synonyms, so use here only the flag name common on both compilers. Signed-off-by: Antonio Borneo <antonio.borneo@foss.st.com> Fixes: c297493336b7 ("irqchip/stm32-exti: Simplify irq description table") Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230601155614.34490-1-antonio.borneo@foss.st.com
2023-06-17irqchip/stm32-exti: Add STM32MP15xx IWDG2 EXTI to GIC mapMarek Vasut
The EXTI interrupt 46 is mapped to GIC interrupt 151. Add the missing mapping, which is used for IWDG2 pretimeout interrupt and wake up source. Reviewed-by: Antonio Borneo <antonio.borneo@foss.st.com> Signed-off-by: Marek Vasut <marex@denx.de> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230517194349.105745-1-marex@denx.de
2023-06-17irqchip/gicv3: Add a iort_pmsi_get_dev_id() prototypeArnd Bergmann
iort_pmsi_get_dev_id() has a __weak definition in the driver, and an override in arm64 specific code, but the declaration is conditional and not always seen when the copy in the driver gets built: drivers/irqchip/irq-gic-v3-its-platform-msi.c:41:12: error: no previous prototype for 'iort_pmsi_get_dev_id' [-Werror=missing-prototypes] Move the existing declaration out of the #ifdef block to ensure it can be seen in all configurations. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230516200516.554663-5-arnd@kernel.org
2023-06-17irqchip/mxs: Include linux/irqchip/mxs.hArnd Bergmann
This header contains the definition for icoll_handle_irq(), which is used in arch/arm/mach-mxs/mach-mxs.c, without this we get a warning about a missing prototype when building with W=1. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230516200516.554663-4-arnd@kernel.org
2023-06-17irqchip/clps711x: Remove unused clps711x_intc_init() functionArnd Bergmann
This function has no caller or declaration any more: drivers/irqchip/irq-clps711x.c:215:13: error: no previous prototype for 'clps711x_intc_init' The #ifdef check around clps711x_intc_init_dt() is also not needed since the file is only built when that is enabled. Fixes: 4a56f46a7dc6 ("ARM: clps711x: Remove boards support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230516200516.554663-3-arnd@kernel.org
2023-06-17irqchip/mmp: Remove non-DT codepathArnd Bergmann
Building with "W=1" warns about missing declarations for two functions in the mmp irqchip driver: drivers/irqchip/irq-mmp.c:248:13: error: no previous prototype for 'icu_init_irq' drivers/irqchip/irq-mmp.c:271:13: error: no previous prototype for 'mmp2_init_icu' The declarations are present in an unused header, but since there is no caller, it's best to just remove the functions and the header completely, making the driver DT-only to match the state of the platform. Fixes: 77acc85ce797 ("ARM: mmp: remove device definitions") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230516200516.554663-2-arnd@kernel.org
2023-06-17irqchip/ftintc010: Mark all function staticArnd Bergmann
Two functions were always global but never had any callers outside of this file: drivers/irqchip/irq-ftintc010.c:128:39: error: no previous prototype for 'ft010_irqchip_handle_irq' drivers/irqchip/irq-ftintc010.c:165:12: error: no previous prototype for 'ft010_of_init_irq' Fixes: b4d3053c8ce9 ("irqchip: Add a driver for Cortina Gemini") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230516200516.554663-1-arnd@kernel.org
2023-06-17irqdomain: Include internals.h for function prototypesArnd Bergmann
irq_domain_debugfs_init() is defined in irqdomain.c, but the declaration is in a header that is not included here: kernel/irq/irqdomain.c:1965:13: error: no previous prototype for 'irq_domain_debugfs_init' [-Werror=missing-prototypes] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230516200432.554240-1-arnd@kernel.org
2023-06-17Merge branch irq/loongarch-fixes-6.5 into irq/irqchip-nextMarc Zyngier
* irq/loongarch-fixes-6.5: : . : Yet another series of random fixes for the Loongson/Loongarch : string of interrupt controller, covering : : - affinity setting, : - trigger polarity, : - wake-up, : - DT support : . irqchip/loongson-eiointc: Add DT init support dt-bindings: interrupt-controller: Add Loongson EIOINTC irqchip/loongson-eiointc: Fix irq affinity setting during resume irqchip/loongson-liointc: Add IRQCHIP_SKIP_SET_WAKE flag irqchip/loongson-liointc: Fix IRQ trigger polarity irqchip/loongson-pch-pic: Fix potential incorrect hwirq assignment irqchip/loongson-pch-pic: Fix initialization of HT vector register Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-06-17irqchip/loongson-eiointc: Add DT init supportBinbin Zhou
Add EIOINTC irqchip DT support, which is needed for Loongson chips based on DT and supporting EIOINTC, such as the Loongson-2K0500 SOC. Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/764e02d924094580ac0f1d15535f4b98308705c6.1683279769.git.zhoubinbin@loongson.cn
2023-06-17dt-bindings: interrupt-controller: Add Loongson EIOINTCBinbin Zhou
Add Loongson Extended I/O Interrupt controller binding with DT schema format using json-schema. Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/4369959615eda101e612c450b8974d76ce7e8821.1683279769.git.zhoubinbin@loongson.cn
2023-06-17parisc: Delete redundant register definitions in <asm/assembly.h>Ben Hutchings
We define sp and ipsw in <asm/asmregs.h> using ".reg", and when using current binutils (snapshot 2.40.50.20230611) the definitions in <asm/assembly.h> using "=" conflict with those: arch/parisc/include/asm/assembly.h: Assembler messages: arch/parisc/include/asm/assembly.h:93: Error: symbol `sp' is already defined arch/parisc/include/asm/assembly.h:95: Error: symbol `ipsw' is already defined Delete the duplicate definitions in <asm/assembly.h>. Also delete the definition of gp, which isn't used anywhere. Signed-off-by: Ben Hutchings <benh@debian.org> Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Helge Deller <deller@gmx.de>
2023-06-16Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A handful of clk driver fixes: - Fix an OOB issue in the Mediatek mt8365 driver where arrays of clks are mismatched in size - Use the proper clk_ops for a few clks in the Mediatek mt8365 driver - Stop using abs() in clk_composite_determine_rate() because 64-bit math goes wrong on large unsigned long numbers that are subtracted and passed into abs() - Zero initialize a struct clk_init_data in clk-loongson2 to avoid stack junk confusing clk_hw_register() - Actually use a pointer to __iomem for writel() in pxa3xx_clk_update_accr() so we don't oops" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: pxa: fix NULL pointer dereference in pxa3xx_clk_update_accr clk: clk-loongson2: Zero init clk_init_data clk: mediatek: mt8365: Fix inverted topclk operations clk: composite: Fix handling of high clock rates clk: mediatek: mt8365: Fix index issue
2023-06-16ksmbd: validate session id and tree id in the compound requestNamjae Jeon
This patch validate session id and tree id in compound request. If first operation in the compound is SMB2 ECHO request, ksmbd bypass session and tree validation. So work->sess and work->tcon could be NULL. If secound request in the compound access work->sess or tcon, It cause NULL pointer dereferecing error. Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-21165 Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-16ksmbd: fix out-of-bound read in smb2_writeNamjae Jeon
ksmbd_smb2_check_message doesn't validate hdr->NextCommand. If ->NextCommand is bigger than Offset + Length of smb2 write, It will allow oversized smb2 write length. It will cause OOB read in smb2_write. Cc: stable@vger.kernel.org Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-21164 Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-16ksmbd: add mnt_want_write to ksmbd vfs functionsNamjae Jeon
ksmbd is doing write access using vfs helpers. There are the cases that mnt_want_write() is not called in vfs helper. This patch add missing mnt_want_write() to ksmbd vfs functions. Cc: stable@vger.kernel.org Cc: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-16ksmbd: validate command payload sizeNamjae Jeon
->StructureSize2 indicates command payload size. ksmbd should validate this size with rfc1002 length before accessing it. This patch remove unneeded check and add the validation for this. [ 8.912583] BUG: KASAN: slab-out-of-bounds in ksmbd_smb2_check_message+0x12a/0xc50 [ 8.913051] Read of size 2 at addr ffff88800ac7d92c by task kworker/0:0/7 ... [ 8.914967] Call Trace: [ 8.915126] <TASK> [ 8.915267] dump_stack_lvl+0x33/0x50 [ 8.915506] print_report+0xcc/0x620 [ 8.916558] kasan_report+0xae/0xe0 [ 8.917080] kasan_check_range+0x35/0x1b0 [ 8.917334] ksmbd_smb2_check_message+0x12a/0xc50 [ 8.917935] ksmbd_verify_smb_message+0xae/0xd0 [ 8.918223] handle_ksmbd_work+0x192/0x820 [ 8.918478] process_one_work+0x419/0x760 [ 8.918727] worker_thread+0x2a2/0x6f0 [ 8.919222] kthread+0x187/0x1d0 [ 8.919723] ret_from_fork+0x1f/0x30 [ 8.919954] </TASK> Cc: stable@vger.kernel.org Reported-by: Chih-Yen Chang <cc85nod@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-06-16Merge tag 'drm-fixes-2023-06-17' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "A bunch of misc fixes across the board. amdgpu is the usual bulk with a revert and other fixes, nouveau has a race fix that was causing a UAF that was hard hanging systems, otherwise some qaic, bridge and radeon. amdgpu: - GFX9 preemption fixes - Add missing radeon secondary PCI ID - vblflash fixes - SMU 13 fix - VCN 4.0 fix - Re-enable TOPDOWN flag for large BAR systems to fix regression - eDP fix - PSR hang fix - DPIA fix radeon: - fbdev client warning fix qaic: - leak fix - null ptr deref fix nouveau: - use-after-free caused by fence race fix - runtime pm fix - NULL ptr checks bridge: - ti-sn65dsi86: Avoid possible buffer overflow" * tag 'drm-fixes-2023-06-17' of git://anongit.freedesktop.org/drm/drm: (21 commits) nouveau: fix client work fence deletion race drm/amd/display: limit DPIA link rate to HBR3 drm/amd/display: fix the system hang while disable PSR drm/amd/display: edp do not add non-edid timings Revert "drm/amdgpu: remove TOPDOWN flags when allocating VRAM in large bar system" drm/amdgpu: vcn_4_0 set instance 0 init sched score to 1 drm/radeon: Disable outputs when releasing fbdev client drm/amd/pm: workaround for compute workload type on some skus drm/amd: Tighten permissions on VBIOS flashing attributes drm/amd: Make sure image is written to trigger VBIOS image update flow drm/amdgpu: add missing radeon secondary PCI ID drm/amdgpu: Implement gfx9 patch functions for resubmission drm/amdgpu: Modify indirect buffer packages for resubmission drm/amdgpu: Program gds backup address as zero if no gds allocated drm/nouveau: add nv_encoder pointer check for NULL drm/amdgpu: Reset CP_VMID_PREEMPT after trailing fence signaled drm/nouveau/dp: check for NULL nv_connector->native_mode drm/bridge: ti-sn65dsi86: Avoid possible buffer overflow drm/nouveau: don't detect DSM for non-NVIDIA device accel/qaic: Fix NULL pointer deref in qaic_destroy_drm_device() ...
2023-06-16afs: Fix vlserver probe RTT handlingDavid Howells
In the same spirit as commit ca57f02295f1 ("afs: Fix fileserver probe RTT handling"), don't rule out using a vlserver just because there haven't been enough packets yet to calculate a real rtt. Always set the server's probe rtt from the estimate provided by rxrpc_kernel_get_srtt, which is capped at 1 second. This could lead to EDESTADDRREQ errors when accessing a cell for the first time, even though the vl servers are known and have responded to a probe. Fixes: 1d4adfaf6574 ("rxrpc: Make rxrpc_kernel_get_srtt() indicate validity") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-afs@lists.infradead.org Link: http://lists.infradead.org/pipermail/linux-afs/2023-June/006746.html Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-16Merge tag 'v6.4-rockchip-dtsfixes1' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes Fixes for the reset pin on nanopi r5c, a reset line on SOQuartz, a duplicate usb regulator on rock64 and PCIe register mappings on rk356x. Also some missing cache properties. * tag 'v6.4-rockchip-dtsfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: arm64: dts: rockchip: Fix rk356x PCIe register and range mappings arm64: dts: rockchip: fix button reset pin for nanopi r5c arm64: dts: rockchip: fix nEXTRST on SOQuartz arm64: dts: rockchip: add missing cache properties arm64: dts: rockchip: fix USB regulator on ROCK64 Link: https://lore.kernel.org/r/2885657.e9J7NaK4W3@phil Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-06-16x86/mem_encrypt: Unbreak the AMD_MEM_ENCRYPT=n buildThomas Gleixner
Moving mem_encrypt_init() broke the AMD_MEM_ENCRYPT=n because the declaration of that function was under #ifdef CONFIG_AMD_MEM_ENCRYPT and the obvious placement for the inline stub was the #else path. This is a leftover of commit 20f07a044a76 ("x86/sev: Move common memory encryption code to mem_encrypt.c") which made mem_encrypt_init() depend on X86_MEM_ENCRYPT without moving the prototype. That did not fail back then because there was no stub inline as the core init code had a weak function. Move both the declaration and the stub out of the CONFIG_AMD_MEM_ENCRYPT section and guard it with CONFIG_X86_MEM_ENCRYPT. Fixes: 439e17576eb4 ("init, x86: Move mem_encrypt_init() into arch_cpu_finalize_init()") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Closes: https://lore.kernel.org/oe-kbuild-all/202306170247.eQtCJPE8-lkp@intel.com/
2023-06-16ieee802154: Replace strlcpy with strscpyAzeem Shaikh
strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy(). Direct replacement is safe here since the return values from the helper macros are ignored by the callers. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89 Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230613003326.3538391-1-azeemshaikh38@gmail.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2023-06-17Merge tag 'drm-misc-fixes-2023-06-16' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes drm-misc-fixes maybe in time for v6.4-rc7: - qaic leak and null deref fix. - Fix runtime pm in nouveau. - Fix array overflow in ti-sn65dsi86 pwm chip handling. - Assorted null check fixes in nouveau. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <dev@lankhorst.se> Link: https://patchwork.freedesktop.org/patch/msgid/641eb8a8-fbd7-90ad-0805-310b7fec9344@lankhorst.se
2023-06-16sched/core: Avoid multiple calling update_rq_clock() in __cfsb_csd_unthrottle()Hao Jia
After commit 8ad075c2eb1f ("sched: Async unthrottling for cfs bandwidth"), we may update the rq clock multiple times in the loop of __cfsb_csd_unthrottle(). A prior (although less common) instance of this problem exists in unthrottle_offline_cfs_rqs(). Cure both by ensuring update_rq_clock() is called before the loop and setting RQCF_ACT_SKIP during the loop, to supress further updates. The alternative would be pulling update_rq_clock() out of unthrottle_cfs_rq(), but that gives an even bigger mess. Fixes: 8ad075c2eb1f ("sched: Async unthrottling for cfs bandwidth") Reviewed-By: Ben Segall <bsegall@google.com> Suggested-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Hao Jia <jiahao.os@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20230613082012.49615-4-jiahao.os@bytedance.com
2023-06-16sched/core: Avoid double calling update_rq_clock() in __balance_push_cpu_stop()Hao Jia
There is a double update_rq_clock() invocation: __balance_push_cpu_stop() update_rq_clock() __migrate_task() update_rq_clock() Sadly select_fallback_rq() also needs update_rq_clock() for __do_set_cpus_allowed(), it is not possible to remove the update from __balance_push_cpu_stop(). So remove it from __migrate_task() and ensure all callers of this function call update_rq_clock() prior to calling it. Signed-off-by: Hao Jia <jiahao.os@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20230613082012.49615-3-jiahao.os@bytedance.com
2023-06-16sched/core: Fixed missing rq clock update before calling set_rq_offline()Hao Jia
When using a cpufreq governor that uses cpufreq_add_update_util_hook(), it is possible to trigger a missing update_rq_clock() warning for the CPU hotplug path: rq_attach_root() set_rq_offline() rq_offline_rt() __disable_runtime() sched_rt_rq_enqueue() enqueue_top_rt_rq() cpufreq_update_util() data->func(data, rq_clock(rq), flags) Move update_rq_clock() from sched_cpu_deactivate() (one of it's callers) into set_rq_offline() such that it covers all set_rq_offline() usage. Additionally change rq_attach_root() to use rq_lock_irqsave() so that it will properly manage the runqueue clock flags. Suggested-by: Ben Segall <bsegall@google.com> Signed-off-by: Hao Jia <jiahao.os@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20230613082012.49615-2-jiahao.os@bytedance.com
2023-06-16sched/deadline: Update GRUB description in the documentationVineeth Pillai
Update the details of GRUB to reflect the updated logic. Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Daniel Bristot de Oliveira <bristot@kernel.org> Acked-by: Juri Lelli <juri.lelli@redhat.com> Link: https://lore.kernel.org/r/20230530135526.2385378-2-vineeth@bitbyteword.org
2023-06-16sched/deadline: Fix bandwidth reclaim equation in GRUBVineeth Pillai
According to the GRUB[1] rule, the runtime is depreciated as: "dq = -max{u, (1 - Uinact - Uextra)} dt" (1) To guarantee that deadline tasks doesn't starve lower class tasks, we do not allocate the full bandwidth of the cpu to deadline tasks. Maximum bandwidth usable by deadline tasks is denoted by "Umax". Considering Umax, equation (1) becomes: "dq = -(max{u, (Umax - Uinact - Uextra)} / Umax) dt" (2) Current implementation has a minor bug in equation (2), which this patch fixes. The reclamation logic is verified by a sample program which creates multiple deadline threads and observing their utilization. The tests were run on an isolated cpu(isolcpus=3) on a 4 cpu system. Tests on 6.3.0 ============== RUN 1: runtime=7ms, deadline=period=10ms, RT capacity = 95% TID[693]: RECLAIM=1, (r=7ms, d=10ms, p=10ms), Util: 93.33 TID[693]: RECLAIM=1, (r=7ms, d=10ms, p=10ms), Util: 93.35 RUN 2: runtime=1ms, deadline=period=100ms, RT capacity = 95% TID[708]: RECLAIM=1, (r=1ms, d=100ms, p=100ms), Util: 16.69 TID[708]: RECLAIM=1, (r=1ms, d=100ms, p=100ms), Util: 16.69 RUN 3: 2 tasks Task 1: runtime=1ms, deadline=period=10ms Task 2: runtime=1ms, deadline=period=100ms TID[631]: RECLAIM=1, (r=1ms, d=10ms, p=10ms), Util: 62.67 TID[632]: RECLAIM=1, (r=1ms, d=100ms, p=100ms), Util: 6.37 TID[631]: RECLAIM=1, (r=1ms, d=10ms, p=10ms), Util: 62.38 TID[632]: RECLAIM=1, (r=1ms, d=100ms, p=100ms), Util: 6.23 As seen above, the reclamation doesn't reclaim the maximum allowed bandwidth and as the bandwidth of tasks gets smaller, the reclaimed bandwidth also comes down. Tests with this patch applied ============================= RUN 1: runtime=7ms, deadline=period=10ms, RT capacity = 95% TID[608]: RECLAIM=1, (r=7ms, d=10ms, p=10ms), Util: 95.19 TID[608]: RECLAIM=1, (r=7ms, d=10ms, p=10ms), Util: 95.16 RUN 2: runtime=1ms, deadline=period=100ms, RT capacity = 95% TID[616]: RECLAIM=1, (r=1ms, d=100ms, p=100ms), Util: 95.27 TID[616]: RECLAIM=1, (r=1ms, d=100ms, p=100ms), Util: 95.21 RUN 3: 2 tasks Task 1: runtime=1ms, deadline=period=10ms Task 2: runtime=1ms, deadline=period=100ms TID[620]: RECLAIM=1, (r=1ms, d=10ms, p=10ms), Util: 86.64 TID[621]: RECLAIM=1, (r=1ms, d=100ms, p=100ms), Util: 8.66 TID[620]: RECLAIM=1, (r=1ms, d=10ms, p=10ms), Util: 86.45 TID[621]: RECLAIM=1, (r=1ms, d=100ms, p=100ms), Util: 8.73 Running tasks on all cpus allowing for migration also showed that the utilization is reclaimed to the maximum. Running 10 tasks on 3 cpus SCHED_FLAG_RECLAIM - top shows: %Cpu0 : 94.6 us, 0.0 sy, 0.0 ni, 5.4 id, 0.0 wa %Cpu1 : 95.2 us, 0.0 sy, 0.0 ni, 4.8 id, 0.0 wa %Cpu2 : 95.8 us, 0.0 sy, 0.0 ni, 4.2 id, 0.0 wa [1]: Abeni, Luca & Lipari, Giuseppe & Parri, Andrea & Sun, Youcheng. (2015). Parallel and sequential reclaiming in multicore real-time global scheduling. Signed-off-by: Vineeth Pillai (Google) <vineeth@bitbyteword.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Daniel Bristot de Oliveira <bristot@kernel.org> Acked-by: Juri Lelli <juri.lelli@redhat.com> Link: https://lore.kernel.org/r/20230530135526.2385378-1-vineeth@bitbyteword.org
2023-06-16trace,smp: Add tracepoints for scheduling remotelly called functionsLeonardo Bras
Add a tracepoint for when a CSD is queued to a remote CPU's call_single_queue. This allows finding exactly which CPU queued a given CSD when looking at a csd_function_{entry,exit} event, and also enables us to accurately measure IPI delivery time with e.g. a synthetic event: $ echo 'hist:keys=cpu,csd.hex:ts=common_timestamp.usecs' >\ /sys/kernel/tracing/events/smp/csd_queue_cpu/trigger $ echo 'csd_latency unsigned int dst_cpu; unsigned long csd; u64 time' >\ /sys/kernel/tracing/synthetic_events $ echo \ 'hist:keys=common_cpu,csd.hex:'\ 'time=common_timestamp.usecs-$ts:'\ 'onmatch(smp.csd_queue_cpu).trace(csd_latency,common_cpu,csd,$time)' >\ /sys/kernel/tracing/events/smp/csd_function_entry/trigger $ trace-cmd record -e 'synthetic:csd_latency' hackbench $ trace-cmd report <...>-467 [001] 21.824263: csd_queue_cpu: cpu=0 callsite=try_to_wake_up+0x2ea func=sched_ttwu_pending csd=0xffff8880076148b8 <...>-467 [001] 21.824280: ipi_send_cpu: cpu=0 callsite=try_to_wake_up+0x2ea callback=generic_smp_call_function_single_interrupt+0x0 <...>-489 [000] 21.824299: csd_function_entry: func=sched_ttwu_pending csd=0xffff8880076148b8 <...>-489 [000] 21.824320: csd_latency: dst_cpu=0, csd=18446612682193848504, time=36 Suggested-by: Valentin Schneider <vschneid@redhat.com> Signed-off-by: Leonardo Bras <leobras@redhat.com> Tested-and-reviewed-by: Valentin Schneider <vschneid@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230615065944.188876-7-leobras@redhat.com
2023-06-16trace,smp: Add tracepoints around remotelly called functionsLeonardo Bras
The recently added ipi_send_{cpu,cpumask} tracepoints allow finding sources of IPIs targeting CPUs running latency-sensitive applications. For NOHZ_FULL CPUs, all IPIs are interference, and those tracepoints are sufficient to find them and work on getting rid of them. In some setups however, not *all* IPIs are to be suppressed, but long-running IPI callbacks can still be problematic. Add a pair of tracepoints to mark the start and end of processing a CSD IPI callback, similar to what exists for softirq, workqueue or timer callbacks. Signed-off-by: Leonardo Bras <leobras@redhat.com> Tested-and-reviewed-by: Valentin Schneider <vschneid@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230615065944.188876-5-leobras@redhat.com