Age | Commit message | Author
2024-03-21 | Merge tag 'char-misc-6.9-rc1' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc and other driver subsystem updates from Greg KH: "Here is the big set of char/misc and a number of other driver subsystem updates for 6.9-rc1. Included in here are: - IIO driver updates, loads of new ones and evolution of existing ones - coresight driver updates - const cleanups for many driver subsystems - speakup driver additions - platform remove callback void cleanups - mei driver updates - mhi driver updates - cdx driver updates for MSI interrupt handling - nvmem driver updates - other smaller driver updates and cleanups, full details in the shortlog All of these have been in linux-next for a long time with no reported issue, other than a build warning for the speakup driver" The build warning hits clang and is a gcc (and C23) extension, and is fixed up in the merge. Link: https://lore.kernel.org/all/20240321134831.GA2762840@dev-arch.thelio-3990X/ * tag 'char-misc-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (279 commits) binder: remove redundant variable page_addr uio_dmem_genirq: UIO_MEM_DMA_COHERENT conversion uio_pruss: UIO_MEM_DMA_COHERENT conversion cnic,bnx2,bnx2x: use UIO_MEM_DMA_COHERENT uio: introduce UIO_MEM_DMA_COHERENT type cdx: add MSI support for CDX bus pps: use cflags-y instead of EXTRA_CFLAGS speakup: Add /dev/synthu device speakup: Fix 8bit characters from direct synth parport: sunbpp: Convert to platform remove callback returning void parport: amiga: Convert to platform remove callback returning void char: xillybus: Convert to platform remove callback returning void vmw_balloon: change maintainership MAINTAINERS: change the maintainer for hpilo driver char: xilinx_hwicap: Fix NULL vs IS_ERR() bug hpet: remove hpets::hp_clocksource platform: goldfish: move the separate 'default' propery for CONFIG_GOLDFISH char: xilinx_hwicap: drop casting to void in dev_set_drvdata greybus: move is_gb_* functions out of greybus.h greybus: Remove 
usage of the deprecated ida_simple_xx() API ...
2024-03-21 | Merge tag 'staging-6.9-rc1' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver updates from Greg KH: "Here is the big set of Staging driver cleanups for 6.9-rc1. Nothing major in here, lots of small coding style cleanups for most drivers, and the removal of some obsolete hardware (the emxx_udc and some drivers/staging/board/ files). All of these have been in linux-next for a long time with no reported issues" * tag 'staging-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (122 commits) staging: greybus: Replaces directive __attribute__((packed)) by __packed as suggested by checkpatch staging: greybus: Replace __attribute__((packed)) by __packed in various instances Staging: rtl8192e: Rename function GetHalfNmodeSupportByAPsHandler() Staging: rtl8192e: Rename function rtllib_FlushRxTsPendingPkts() Staging: rtl8192e: Rename goto OnADDBARsp_Reject Staging: rtl8192e: Rename goto OnADDBAReq_Fail Staging: rtl8192e: Rename function rtllib_send_ADDBARsp() Staging: rtl8192e: Rename function rtllib_send_ADDBAReq() Staging: rtl8192e: Rename variable TxRxSelect Staging: rtl8192e: Fix 5 chckpatch alignment warnings in rtl819x_BAProc.c Staging: rtl8192e: Rename function MgntQuery_MgntFrameTxRate Staging: rtl8192e: Rename boolean variable bHalfWirelessN24GMode Staging: rtl8192e: Rename reference AllowAllDestAddrHandler Staging: rtl8192e: Rename varoable asSta Staging: rtl8192e: Rename varoable osCcxVerNum Staging: rtl8192e: Rename variable CcxAironetBuf Staging: rtl8192e: Rename variable osCcxAironetIE Staging: rtl8192e: Rename variable AironetIeOui Staging: rtl8192e: Rename variable asRsn Staging: rtl8192e: Rename variable CcxVerNumBuf ...
2024-03-21 | Merge tag 'tty-6.9-rc1' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty / serial driver updates from Greg KH: "Here is the big set of TTY/Serial driver updates and cleanups for 6.9-rc1. Included in here are: - more tty cleanups from Jiri - loads of 8250 driver cleanups from Andy - max310x driver updates - samsung serial driver updates - uart_prepare_sysrq_char() updates for many drivers - platform driver remove callback void cleanups - stm32 driver updates - other small tty/serial driver updates All of these have been in linux-next for a long time with no reported issues" * tag 'tty-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (199 commits) dt-bindings: serial: stm32: add power-domains property serial: 8250_dw: Replace ACPI device check by a quirk serial: Lock console when calling into driver before registration serial: 8250_uniphier: Switch to use uart_read_port_properties() serial: 8250_tegra: Switch to use uart_read_port_properties() serial: 8250_pxa: Switch to use uart_read_port_properties() serial: 8250_omap: Switch to use uart_read_port_properties() serial: 8250_of: Switch to use uart_read_port_properties() serial: 8250_lpc18xx: Switch to use uart_read_port_properties() serial: 8250_ingenic: Switch to use uart_read_port_properties() serial: 8250_dw: Switch to use uart_read_port_properties() serial: 8250_bcm7271: Switch to use uart_read_port_properties() serial: 8250_bcm2835aux: Switch to use uart_read_port_properties() serial: 8250_aspeed_vuart: Switch to use uart_read_port_properties() serial: port: Introduce a common helper to read properties serial: core: Add UPIO_UNKNOWN constant for unknown port type serial: core: Move struct uart_port::quirks closer to possible values serial: sh-sci: Call sci_serial_{in,out}() directly serial: core: only stop transmit when HW fifo is empty serial: pch: Use uart_prepare_sysrq_char(). ...
2024-03-21 | Merge tag 'usb-6.9-rc1' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / Thunderbolt updates from Greg KH: "Here is the big set of USB and Thunderbolt changes for 6.9-rc1. Lots of tiny changes and forward progress to support new hardware and better support for existing devices. Included in here are: - Thunderbolt (i.e. USB4) updates for newer hardware and uses as more people start to use the hardware - default USB authentication mode Kconfig and documentation update to make it more obvious what is going on - USB typec updates and enhancements - usual dwc3 driver updates - usual xhci driver updates - function USB (i.e. gadget) driver updates and additions - new device ids for lots of drivers - loads of other small updates, full details in the shortlog All of these, including a "last minute regression fix" have been in linux-next with no reported issues" * tag 'usb-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (185 commits) usb: usb-acpi: Fix oops due to freeing uninitialized pld pointer usb: gadget: net2272: Use irqflags in the call to net2272_probe_fin usb: gadget: tegra-xudc: Fix USB3 PHY retrieval logic phy: tegra: xusb: Add API to retrieve the port number of phy USB: gadget: pxa27x_udc: Remove unused of_gpio.h usb: gadget/snps_udc_plat: Remove unused of_gpio.h usb: ohci-pxa27x: Remove unused of_gpio.h usb: sl811-hcd: only defined function checkdone if QUIRK2 is defined usb: Clarify expected behavior of dev_bin_attrs_are_visible() xhci: Allow RPM on the USB controller (1022:43f7) by default usb: isp1760: remove SLAB_MEM_SPREAD flag usage usb: misc: onboard_hub: use pointer consistently in the probe function usb: gadget: fsl: Increase size of name buffer for endpoints usb: gadget: fsl: Add of device table to enable module autoloading usb: typec: tcpm: add support to set tcpc connector orientatition usb: typec: tcpci: add generic tcpci fallback compatible dt-bindings: usb: typec-tcpci: add tcpci fallback binding usb: gadget: fsl-udc: Replace 
custom log wrappers by dev_{err,warn,dbg,vdbg} usb: core: Set connect_type of ports based on DT node dt-bindings: usb: Add downstream facing ports to realtek binding ...
2024-03-21 | sched/doc: Update documentation for base_slice_ns and CONFIG_HZ relation | Mukesh Kumar Chaurasiya
The tunable base_slice_ns is dependent on CONFIG_HZ (i.e. TICK_NSEC) for any significant performance improvement. The reason being the scheduler tick is not frequent enough to force preemption when base_slice expires in case of:

    base_slice_ns < TICK_NSEC

The below data is of stress-ng:

    Number of CPU: 1
    Stressor threads: 4
    Time: 30sec

On CONFIG_HZ=1000

| base_slice | avg-run (msec) | context-switches |
| ---------- | -------------- | ---------------- |
| 3ms        | 2.914          | 10342            |
| 6ms        | 4.857          | 6196             |
| 9ms        | 6.754          | 4482             |
| 12ms       | 7.872          | 3802             |
| 22ms       | 11.294         | 2710             |
| 32ms       | 13.425         | 2284             |

On CONFIG_HZ=100

| base_slice | avg-run (msec) | context-switches |
| ---------- | -------------- | ---------------- |
| 3ms        | 9.144          | 3337             |
| 6ms        | 9.113          | 3301             |
| 9ms        | 8.991          | 3315             |
| 12ms       | 12.935         | 2328             |
| 22ms       | 16.031         | 1915             |
| 32ms       | 18.608         | 1622             |

base_slice: the value of base_slice in ms
avg-run (msec): average time of the stressor threads got on cpu before it got preempted
context-switches: number of context switches for the stress-ng process

Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20240320173815.927637-2-mchauras@linux.ibm.com
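The relation between base_slice_ns and CONFIG_HZ can be checked with a little arithmetic. The sketch below is not kernel code: it assumes the kernel's rounded definition of TICK_NSEC, (NSEC_PER_SEC + HZ/2) / HZ, and the helper names are invented for this illustration. It only reports whether a requested slice is shorter than one tick, which is the case the commit message singles out.

```python
NSEC_PER_SEC = 1_000_000_000
NSEC_PER_MSEC = 1_000_000


def tick_nsec(hz: int) -> int:
    """Length of one scheduler tick in nanoseconds, rounded as TICK_NSEC is (assumed)."""
    return (NSEC_PER_SEC + hz // 2) // hz


def slice_shorter_than_tick(base_slice_ns: int, hz: int) -> bool:
    """True when base_slice_ns < TICK_NSEC, the case called out in the commit."""
    return base_slice_ns < tick_nsec(hz)


def report(hz: int, slices_ms=(3, 6, 9, 12, 22, 32)) -> list:
    """Pair each base_slice value (in ms) with whether it is below one tick."""
    return [
        (ms, slice_shorter_than_tick(ms * NSEC_PER_MSEC, hz))
        for ms in slices_ms
    ]
```

With hz=100 the tick is 10 ms, so the 3, 6 and 9 ms slices fall below it, which lines up with the nearly flat avg-run figures reported for CONFIG_HZ=100; with hz=1000 none of the listed slices are below one tick.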
2024-03-21 | Merge tag 'nvme-6.9-2024-03-21' of git://git.infradead.org/nvme into block-6.9 | Jens Axboe
Pull NVMe fixes from Keith: "nvme updates for Linux 6.9 - Make an informative message less ominous (Keith) - Enhanced trace decoding (Guixin) - TCP updates (Hannes, Li) - Fabrics connect deadlock fix (Chunguang) - Platform API migration update (Uwe) - A new device quirk (Jiawei)" * tag 'nvme-6.9-2024-03-21' of git://git.infradead.org/nvme: nvmet-rdma: remove NVMET_RDMA_REQ_INVALIDATE_RKEY flag nvme: remove redundant BUILD_BUG_ON check nvme/tcp: Add wq_unbound modparam for nvme_tcp_wq nvme-tcp: Export the nvme_tcp_wq to sysfs drivers/nvme: Add quirks for device 126f:2262 nvme: parse format command's lbafu when tracing nvme: add tracing of reservation commands nvme: parse zns command's zsa and zrasf to string nvme: use nvme_disk_is_ns_head helper nvme: fix reconnection fail due to reserved tag allocation nvmet: add tracing of zns commands nvmet: add tracing of authentication commands nvme-apple: Convert to platform remove callback returning void nvmet-tcp: do not continue for invalid icreq nvme: change shutdown timeout setting message
2024-03-21 | fbdev: panel-tpo-td043mtea1: Convert sprintf() to sysfs_emit() | Li Zhijian
Per filesystems/sysfs.rst, show() should only use sysfs_emit() or sysfs_emit_at() when formatting the value to be returned to user space. coccinelle complains that there are still a couple of functions that use snprintf(). Convert them to sysfs_emit(). CC: Helge Deller <deller@gmx.de> CC: linux-omap@vger.kernel.org CC: linux-fbdev@vger.kernel.org CC: dri-devel@lists.freedesktop.org Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Helge Deller <deller@gmx.de>
2024-03-21 | libbpf: Define MFD_CLOEXEC if not available | Arnaldo Carvalho de Melo
Since it's going directly to the syscall to avoid not having memfd_create() available in some systems, do the same for its MFD_CLOEXEC flag, defining it if not available. This fixes the build in those systems, noticed while building perf on a set of build containers. Fixes: 9fa5e1a180aa639f ("libbpf: Call memfd_create() syscall directly") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/ZfxZ9nCyKvwmpKkE@x1
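The "define it if the headers lack it" pattern can also be shown outside C. The following is a Python analogue, not libbpf code: it assumes the Linux UAPI value 0x0001 for MFD_CLOEXEC, and make_memfd is a hypothetical helper name used only here.

```python
import os

# Use the constant from the os module when present; otherwise fall back to
# the value from the Linux UAPI header (assumed here to be 0x0001).
MFD_CLOEXEC = getattr(os, "MFD_CLOEXEC", 0x0001)


def make_memfd(name: str):
    """Create an anonymous memory file, or return None if unsupported.

    Hypothetical helper for illustration; os.memfd_create() is only
    available on Linux builds of Python.
    """
    if not hasattr(os, "memfd_create"):
        return None
    return os.memfd_create(name, MFD_CLOEXEC)
```

The point of the fallback is the same as in the commit: code that uses the flag keeps building (here, keeps importing) on systems whose headers or libraries do not provide it.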
2024-03-21 | Merge tag 'hwlock-v6.9' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux Pull hwspinlock updates from Bjorn Andersson: "Some code cleanup for the OMAP hwspinlock driver" * tag 'hwlock-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: hwspinlock: omap: Use index to get hwspinlock pointer hwspinlock: omap: Use devm_hwspin_lock_register() helper hwspinlock: omap: Use devm_pm_runtime_enable() helper hwspinlock: omap: Remove unneeded check for OF node
2024-03-21 | nvmet-rdma: remove NVMET_RDMA_REQ_INVALIDATE_RKEY flag | Guixin Liu
We can simply use invalidate_rkey to check instead of adding a flag. Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-03-21 | nvme: remove redundant BUILD_BUG_ON check | Guixin Liu
Remove redundant BUILD_BUG_ON check of struct nvme_dsm_range, it's already checked in nvme_init_ctrl(). Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-03-21 | Merge tag 'rpmsg-v6.9' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux Pull rpmsg updates from Bjorn Andersson: "This transitions rpmsg_ctrl and rpmsg_char drivers away from the deprecated ida_simple_*() API. It also makes the rpmsg_bus const" * tag 'rpmsg-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: rpmsg: core: Make rpmsg_bus const rpmsg: Remove usage of the deprecated ida_simple_xx() API
2024-03-21 | Merge tag 'rproc-v6.9' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux Pull remoteproc updates from Bjorn Andersson: "Qualcomm SM8650 audio, compute and modem remoteproc are added. Qualcomm X1 Elite audio and compute remoteprocs are added, after support for shutting down the bootloader-loaded firmware loaded into the audio DSP. A dozen drivers in the subsystem are transitioned to use devres helpers for remoteproc and memory allocations - this makes it possible to acquire an in-kernel handle to individual remoteproc instances in a cluster. The release of DMA memory for remoteproc virtio is corrected to ensure that restarting due to a watchdog bite doesn't attempt to allocate the memory again without first freeing it. Last, but not least, a couple of DeviceTree binding cleanups" * tag 'rproc-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: (30 commits) remoteproc: qcom_q6v5_pas: Unload lite firmware on ADSP remoteproc: qcom_q6v5_pas: Add support for X1E80100 ADSP/CDSP dt-bindings: remoteproc: qcom,sm8550-pas: document the X1E80100 aDSP & cDSP remoteproc: qcom_wcnss: Use devm_rproc_alloc() helper remoteproc: qcom_q6v5_wcss: Use devm_rproc_alloc() helper remoteproc: qcom_q6v5_pas: Use devm_rproc_alloc() helper remoteproc: qcom_q6v5_mss: Use devm_rproc_alloc() helper remoteproc: qcom_q6v5_adsp: Use devm_rproc_alloc() helper dt-bindings: remoteproc: do not override firmware-name $ref dt-bindings: remoteproc: qcom,glink-rpm-edge: drop redundant type from label remoteproc: qcom: pas: correct data indentation remoteproc: Make rproc_get_by_phandle() work for clusters remoteproc: qcom: pas: Add SM8650 remoteproc support remoteproc: qcom: pas: make region assign more generic dt-bindings: remoteproc: qcom,sm8550-pas: document the SM8650 PAS remoteproc: k3-dsp: Use devm_rproc_add() helper remoteproc: k3-dsp: Use devm_ioremap_wc() helper remoteproc: k3-dsp: Add devm action to release tsp remoteproc: k3-dsp: Use devm_kzalloc() helper remoteproc: k3-dsp: Use
devm_ti_sci_get_by_phandle() helper ...
2024-03-21 | dm-integrity: align the outgoing bio in integrity_recheck | Mikulas Patocka
It is possible to set up dm-integrity with smaller sector size than the logical sector size of the underlying device. In this situation, dm-integrity guarantees that the outgoing bios have the same alignment as incoming bios (so, if you create a filesystem with 4k block size, dm-integrity would send 4k-aligned bios to the underlying device). This guarantee was broken when integrity_recheck was implemented. integrity_recheck sends bio that is aligned to ic->sectors_per_block. So if we set up integrity with 512-byte sector size on a device with logical block size 4k, we would be sending unaligned bio. This triggered a bug in one of our internal tests. This commit fixes it by determining the actual alignment of the incoming bio and then makes sure that the outgoing bio in integrity_recheck has the same alignment. Fixes: c88f5e553fe3 ("dm-integrity: recheck the integrity tag after a failure") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org>
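The alignment idea described above can be sketched in a few lines. This is an illustrative model only, not the dm-integrity code: the function names are hypothetical, sectors are counted in 512-byte units, and the cap of 8 sectors (4 KiB) is an assumption added so the result is never zero. The alignment of a bio is taken as the lowest set bit of its start sector and length, and a smaller recheck range is then widened to that alignment.

```python
def bio_alignment(start_sector: int, n_sectors: int, max_alignment: int = 8) -> int:
    """Largest power of two (capped at max_alignment) dividing both start and length."""
    combined = start_sector | n_sectors | max_alignment
    return combined & -combined  # isolate the lowest set bit


def widen_to_alignment(sector: int, count: int, alignment: int) -> tuple:
    """Round a (sector, count) range outwards so both ends are aligned."""
    aligned_start = sector - (sector % alignment)
    end = sector + count
    aligned_end = -(-end // alignment) * alignment  # ceiling to a multiple
    return aligned_start, aligned_end - aligned_start
```

For example, if a 4k-aligned incoming bio starts at sector 8 with 8 sectors, bio_alignment() gives 8, and rechecking the single 512-byte block at sector 9 would be widened to the range starting at sector 8 with 8 sectors, so the outgoing read keeps the incoming alignment.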
2024-03-21 | Merge tag 'cocci-6.9-rc1' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux Pull coccinelle update from Julia Lawall: "Simplify the device_attr_show semantic patch Also removes an unused variable warning" * tag 'cocci-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlawall/linux: coccinelle: device_attr_show: Remove useless expression STR
2024-03-21 | Merge tag 'sh-for-v6.9-tag1' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux Pull sh updates from John Paul Adrian Glaubitz: "Two patches by Ricardo B. Marliere make two instances of struct bus_type in the interrupt controller driver and the DMA sysfs interface const since the driver core in the kernel is now able to handle that. A third patch by Artur Rojek enforces internal linkage for the function setup_hd64461() in order to fix the build of hp6xx_defconfig with -Werror=missing-prototypes" * tag 'sh-for-v6.9-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux: sh: hd64461: Make setup_hd64461() static sh: intc: Make intc_subsys const sh: dma-sysfs: Make dma_subsys const
2024-03-21 | exec: Fix NOMMU linux_binprm::exec in transfer_args_to_stack() | Max Filippov
In NOMMU kernel the value of linux_binprm::p is the offset inside the temporary program arguments array maintained in separate pages in the linux_binprm::page. linux_binprm::exec being a copy of linux_binprm::p thus must be adjusted when that array is copied to the user stack. Without that adjustment the value passed by the NOMMU kernel to the ELF program in the AT_EXECFN entry of the aux array doesn't make any sense and it may break programs that try to access memory pointed to by that entry. Adjust linux_binprm::exec before the successful return from the transfer_args_to_stack(). Cc: <stable@vger.kernel.org> Fixes: b6a2fea39318 ("mm: variable length argument support") Fixes: 5edc2a5123a7 ("binfmt_elf_fdpic: wire up AT_EXECFD, AT_EXECFN, AT_SECURE") Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Link: https://lore.kernel.org/r/20240320182607.1472887-1-jcmvbkbc@gmail.com Signed-off-by: Kees Cook <keescook@chromium.org>
2024-03-21 | Merge tag 'hyperv-next-signed-20240320' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Use Hyper-V entropy to seed guest random number generator (Michael Kelley) - Convert to platform remove callback returning void for vmbus (Uwe Kleine-König) - Introduce hv_get_hypervisor_version function (Nuno Das Neves) - Rename some HV_REGISTER_* defines for consistency (Nuno Das Neves) - Change prefix of generic HV_REGISTER_* MSRs to HV_MSR_* (Nuno Das Neves) - Cosmetic changes for hv_spinlock.c (Purna Pavan Chandra Aekkaladevi) - Use per cpu initial stack for vtl context (Saurabh Sengar) * tag 'hyperv-next-signed-20240320' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Use Hyper-V entropy to seed guest random number generator x86/hyperv: Cosmetic changes for hv_spinlock.c hyperv-tlfs: Rename some HV_REGISTER_* defines for consistency hv: vmbus: Convert to platform remove callback returning void mshyperv: Introduce hv_get_hypervisor_version function x86/hyperv: Use per cpu initial stack for vtl context hyperv-tlfs: Change prefix of generic HV_REGISTER_* MSRs to HV_MSR_*
2024-03-21 | Merge tag 'for-6.9-part2-tag' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "Fix a problem found in 6.7 after adding the temp-fsid feature which changed device tracking in memory and broke grub-probe. This is used on initrd-less systems. There were several iterations of the fix and it took longer than expected" * tag 'for-6.9-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: do not skip re-registration for the mounted device
2024-03-21 | Merge tag 'exfat-for-6.9-rc1' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat updates from Namjae Jeon: - Improve dirsync performance by syncing on a dentry-set rather than on a per-directory entry * tag 'exfat-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: remove duplicate update parent dir exfat: do not sync parent dir if just update timestamp exfat: remove unused functions exfat: convert exfat_find_empty_entry() to use dentry cache exfat: convert exfat_init_ext_entry() to use dentry cache exfat: move free cluster out of exfat_init_ext_entry() exfat: convert exfat_remove_entries() to use dentry cache exfat: convert exfat_add_entry() to use dentry cache exfat: add exfat_get_empty_dentry_set() helper exfat: add __exfat_get_dentry_set() helper
2024-03-21 | Merge tag 'bitmap-for-6.9' of https://github.com/norov/linux | Linus Torvalds
Pull bitmap updates from Yury Norov: "A couple of random cleanups plus a step-down patch from Andy" * tag 'bitmap-for-6.9' of https://github.com/norov/linux: bitmap: Step down as a reviewer lib/find: optimize find_*_bit_wrap lib/find_bit: Fix the code comments about find_next_bit_wrap
2024-03-21 | drm/i915: Do not print 'pxp init failed with 0' when it succeed | José Roberto de Souza
It is misleading. If the intention was to also print something when it succeeds, it should use a different string. Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Fixes: 698e19da2914 ("drm/i915: Skip pxp init if gt is wedged") Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240320210547.71937-1-jose.souza@intel.com
2024-03-21 | Merge tag 'nf-24-03-21' of ↵ | Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net. There is a larger batch of fixes still pending that will follow up asap, this is what I deemed to be more urgent at this time: 1) Use clone view in pipapo set backend to release elements from destroy path, otherwise it is possible to destroy elements twice. 2) Incorrect check for internal table flags lead to bogus transaction objects. 3) Fix counters memleak in netdev basechain update error path, from Quan Tian. netfilter pull request 24-03-21 * tag 'nf-24-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: Fix a memory leak in nf_tables_updchain netfilter: nf_tables: do not compare internal table flags on updates netfilter: nft_set_pipapo: release elements in clone only from destroy path ==================== Link: https://lore.kernel.org/r/20240321112117.36737-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-21 | Merge tag 'asoc-fix-v6.9-merge-window' of ↵ | Takashi Iwai
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.9 A bunch of fixes that came in during the merge window, probably the most substantial thing is the DPCM locking fix for compressed audio which has been lurking for a while.
2024-03-21 | firewire: core: add memo about the caller of show functions for device ↵ | Takashi Sakamoto
attributes In the case of firewire core function, the caller of show functions for device attributes is not only sysfs user, but also device initialization. This commit adds memo about it against the typical assumption that the functions are just dedicated to sysfs user. Link: https://lore.kernel.org/lkml/20240318091759.678326-1-o-takashi@sakamocchi.jp/ Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
2024-03-21 | drm/i915/cx0: pass encoder instead of i915 and port around | Jani Nikula
The encoder is a much more useful thing to pass around than the i915 and port combo. Also drive-by clean up some cases where both i915 and encoder are passed; only the latter is needed. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/f9308e47a3a66bd74479480964c8a538e3f6a358.1710949619.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-03-21 | drm/i915/cx0: remove the unused intel_is_c10phy() | Jani Nikula
The intel_is_c10phy() is now unused. Remove. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/486ad2832c567ae491726c6c0cd7144e14469a2f.1710949619.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-03-21 | drm/i915/display: use intel_encoder_is/to_* functions | Jani Nikula
Wherever possible, replace the port/phy based functions with the encoder based functions: intel_is_c10phy() -> intel_encoder_is_c10phy() intel_phy_is_combo() -> intel_encoder_is_combo() intel_phy_is_tc() -> intel_encoder_is_tc() intel_port_to_phy() -> intel_encoder_to_phy() intel_port_to_tc() -> intel_encoder_to_tc() Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ce8d116fcdd7662fa0a0817200a8e6fda313e496.1710949619.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-03-21 | drm/i915/display: add intel_encoder_is_*() and _to_*() functions | Jani Nikula
Add a number of encoder based functions to check if the port/phy of the encoder is of a certain type, or to convert to phy or tc_port. Initially these are just wrappers around the existing functions, but they can be improved to use VBT data or use some cached info in the future. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/7b2d350ee42883f2784030c649d16f983bd407bd.1710949619.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-03-21 | drm/i915/snps: pass encoder to intel_snps_phy_update_psr_power_state() | Jani Nikula
Pass encoder to intel_snps_phy_update_psr_power_state(). The encoder will be more helpful than just port in the subsequent changes. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/4711919a9834cf4a49fd665009ba9d44b4b42bc4.1710949619.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-03-21 | drm/i915/ddi: pass encoder to intel_wait_ddi_buf_active() | Jani Nikula
Pass encoder to intel_wait_ddi_buf_active(). The encoder will be more helpful than just port in the subsequent changes. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/6a299c4c575a260c0ba88b2e99931d48945269be.1710949619.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-03-21 | drm/i915/hdmi: convert *_port_to_ddc_pin() to *_encoder_to_ddc_pin() | Jani Nikula
Pass encoder to the _port_to_ddc_pin() functions, and rename to _encoder_to_ddc_pin(). The encoder will be more helpful than just port in the subsequent changes. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c94debf36816157de1105a186b061fd90dab574a.1710949619.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-03-21 | Merge tag 'linux-can-fixes-for-6.9-20240319' of ↵ | Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2024-03-20 this is a pull request of 1 patch for net/master. Martin Jocić contributes a fix for the kvaser_pciefd driver, so that up to 8 channels on the Xilinx-based adapters can be used. This issue has been introduced in net-next for v6.9. linux-can-fixes-for-6.9-20240319 * tag 'linux-can-fixes-for-6.9-20240319' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: kvaser_pciefd: Add additional Xilinx interrupts ==================== Link: https://lore.kernel.org/r/20240320112144.582741-1-mkl@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-21 | selftests: forwarding: Fix ping failure due to short timeout | Ido Schimmel
The tests send 100 pings in 0.1 second intervals and force a timeout of 11 seconds, which is borderline (especially on debug kernels), resulting in random failures in netdev CI [1].

Fix by increasing the timeout to 20 seconds. It should not prolong the test unless something is wrong, in which case the test will rightfully fail.

[1]
 # selftests: net/forwarding: vxlan_bridge_1d_port_8472_ipv6.sh
 # INFO: Running tests with UDP port 8472
 # TEST: ping: local->local [ OK ]
 # TEST: ping: local->remote 1 [FAIL]
 # Ping failed
 [...]

Fixes: b07e9957f220 ("selftests: forwarding: Add VxLAN tests with a VLAN-unaware bridge for IPv6")
Fixes: 728b35259e28 ("selftests: forwarding: Add VxLAN tests with a VLAN-aware bridge for IPv6")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Closes: https://lore.kernel.org/netdev/24a7051fdcd1f156c3704bca39e4b3c41dfc7c4b.camel@redhat.com/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240320065717.4145325-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
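Why an 11 second timeout is borderline follows from simple arithmetic. The sketch below is a back-of-the-envelope model, not part of the selftests: it assumes the first echo request is sent at t=0 and each following one exactly one interval later, with no scheduling delay, and the function names are invented for this illustration.

```python
def last_probe_sent_ms(count: int, interval_ms: int) -> int:
    """Time at which the final echo request leaves, with the first sent at t=0."""
    return (count - 1) * interval_ms


def reply_margin_ms(count: int, interval_ms: int, timeout_ms: int) -> int:
    """Time left between sending the last probe and the overall timeout."""
    return timeout_ms - last_probe_sent_ms(count, interval_ms)
```

Under these assumptions, 100 pings at 100 ms intervals finish sending at 9.9 s, leaving about 1.1 s of slack with an 11 s timeout and about 10.1 s with a 20 s timeout. Any slowdown in sending, as on a debug kernel, eats directly into the smaller margin.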
2024-03-21 | spi: spi-mt65xx: Fix NULL pointer access in interrupt handler | Fei Shao
The TX buffer in spi_transfer can be a NULL pointer, so the interrupt handler may end up writing to the invalid memory and cause crashes. Add a check to trans->tx_buf before using it. Fixes: 1ce24864bff4 ("spi: mediatek: Only do dma for 4-byte aligned buffers") Signed-off-by: Fei Shao <fshao@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://msgid.link/r/20240321070942.1587146-2-fshao@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-21 | MAINTAINERS: step down as netfilter maintainer | Florian Westphal
I do not feel that I'm up to the task anymore. I hope this to be a temporary emergency measure, but for now I'm sure this is the best course of action for me. Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/20240319121223.24474-1-fw@strlen.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-21 | sh: hd64461: Make setup_hd64461() static | Artur Rojek
Enforce internal linkage for setup_hd64461(). This fixes the following error: arch/sh/cchips/hd6446x/hd64461.c:75:12: error: no previous prototype for 'setup_hd64461' [-Werror=missing-prototypes] Signed-off-by: Artur Rojek <contact@artur-rojek.eu> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Link: https://lore.kernel.org/r/20240211193451.106795-1-contact@artur-rojek.eu Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
2024-03-21 | netfilter: nf_tables: Fix a memory leak in nf_tables_updchain | Quan Tian
If nft_netdev_register_hooks() fails, the memory associated with nft_stats is not freed, causing a memory leak. This patch fixes it by moving nft_stats_alloc() down after nft_netdev_register_hooks() succeeds. Fixes: b9703ed44ffb ("netfilter: nf_tables: support for adding new devices to an existing netdev chain") Signed-off-by: Quan Tian <tianquan23@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-03-21 | Merge branch ↵ | Paolo Abeni
'mt7530-dsa-subdriver-fix-vlan-egress-and-handling-of-all-link-local-frames' says: ==================== MT7530 DSA subdriver fix VLAN egress and handling of all link-local frames This patch series fixes the VLAN tag egress procedure for link-local frames, and fixes handling of all link-local frames. Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> ==================== Link: https://lore.kernel.org/r/20240314-b4-for-net-mt7530-fix-link-local-vlan-v2-0-7dbcf6429ba0@arinc9.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-21 net: dsa: mt7530: fix handling of all link-local frames (Arınç ÜNAL)
Currently, the MT753X switches treat frames with :01-0D and :0F MAC DAs as regular multicast frames, therefore flooding them to user ports. On page 205, section "8.6.3 Frame filtering" of the active standard, IEEE Std 802.1Q™-2022, it is stated that frames with 01:80:C2:00:00:00-0F as MAC DA must only be propagated to C-VLAN and MAC Bridge components. That means VLAN-aware and VLAN-unaware bridges. On the switch designs with CPU ports, these frames are supposed to be processed by the CPU (software). So we make the switch only forward them to the CPU port. And if received from a CPU port, forward to a single port. The software is responsible for making the switch conform to the latter by setting a single port as destination port on the special tag. This switch intellectual property cannot conform to this part of the standard fully. Whilst the REV_UN frame tag covers the remaining :04-0D and :0F MAC DAs, it also includes :22-FF, for which the scope of propagation is not supposed to be restricted. Set frames with :01-03 MAC DAs to be trapped to the CPU port(s). Add a comment for the remaining MAC DAs. Note that the ingress port must have a PVID assigned to it for the switch to forward untagged frames. A PVID is set by default on VLAN-aware and VLAN-unaware ports. However, when the network interface that pertains to the ingress port is attached to a vlan_filtering enabled bridge, the user can remove the PVID assignment from it which would prevent the link-local frames from being trapped to the CPU port. I have yet to see a way to forward link-local frames while preventing other untagged frames from being forwarded too. Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
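The address ranges discussed above can be summarised with a small classifier. This is a sketch in plain C, not driver code: the enum and function names are hypothetical, and treating :00 and :0E as "already handled" is an inference from the message (which lists only :01-0D and :0F as still being flooded), not something it states outright.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum ll_class {
	LL_NONE,		/* not in 01:80:C2:00:00:00-0F */
	LL_ALREADY_HANDLED,	/* :00 and :0E (assumed handled before this change) */
	LL_TRAPPED_HERE,	/* :01-03, trapped to the CPU port by this change */
	LL_REMAINING,		/* :04-0D and :0F, only commented on */
};

/* True if the destination address is in the 802.1Q link-local range. */
static bool is_link_local(const uint8_t da[6])
{
	static const uint8_t prefix[5] = { 0x01, 0x80, 0xc2, 0x00, 0x00 };

	return memcmp(da, prefix, sizeof(prefix)) == 0 && da[5] <= 0x0f;
}

static enum ll_class classify_da(const uint8_t da[6])
{
	if (!is_link_local(da))
		return LL_NONE;
	if (da[5] == 0x00 || da[5] == 0x0e)
		return LL_ALREADY_HANDLED;
	if (da[5] >= 0x01 && da[5] <= 0x03)
		return LL_TRAPPED_HERE;
	return LL_REMAINING;
}
```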
2024-03-21 net: dsa: mt7530: fix link-local frames that ingress vlan filtering ports (Arınç ÜNAL)
Whether VLAN-aware or not, on every VID VLAN table entry that has the CPU port as a member of it, frames are set to egress the CPU port with the VLAN tag stacked. This is so that VLAN tags can be appended after hardware special tag (called DSA tag in the context of Linux drivers).

For user ports on a VLAN-unaware bridge, frame ingressing the user port egresses CPU port with only the special tag.

For user ports on a VLAN-aware bridge, frame ingressing the user port egresses CPU port with the special tag and the VLAN tag.

This causes issues with link-local frames, specifically BPDUs, because the software expects to receive them VLAN-untagged.

There are two options to make link-local frames egress untagged. Setting CONSISTENT or UNTAGGED on the EG_TAG bits on the relevant register. CONSISTENT means frames egress exactly as they ingress. That means egressing with the VLAN tag they had at ingress or egressing untagged if they ingressed untagged. Although link-local frames are not supposed to be transmitted VLAN-tagged, if they are done so, when egressing through a CPU port, the special tag field will be broken.

BPDU egresses CPU port with VLAN tag egressing stacked, received on software:

00:01:25.104821 AF Unknown (382365846), length 106:
                               | STAG  | | VLAN  |
  0x0000:  0000 6c27 614d 4143 0001 0000 8100 0001  ..l'aMAC........
  0x0010:  0026 4242 0300 0000 0000 0000 6c27 614d  .&BB........l'aM
  0x0020:  4143 0000 0000 0000 6c27 614d 4143 0000  AC......l'aMAC..
  0x0030:  0000 1400 0200 0f00 0000 0000 0000 0000  ................

BPDU egresses CPU port with VLAN tag egressing untagged, received on software:

00:23:56.628708 AF Unknown (25215488), length 64:
                               | STAG  |
  0x0000:  0000 6c27 614d 4143 0001 0000 0026 4242  ..l'aMAC.....&BB
  0x0010:  0300 0000 0000 0000 6c27 614d 4143 0000  ........l'aMAC..
  0x0020:  0000 0000 6c27 614d 4143 0000 0000 1400  ....l'aMAC......
  0x0030:  0200 0f00 0000 0000 0000 0000            ............

BPDU egresses CPU port with VLAN tag egressing tagged, received on software:

00:01:34.311963 AF Unknown (25215488), length 64:
                               | Mess  |
  0x0000:  0000 6c27 614d 4143 0001 0001 0026 4242  ..l'aMAC.....&BB
  0x0010:  0300 0000 0000 0000 6c27 614d 4143 0000  ........l'aMAC..
  0x0020:  0000 0000 6c27 614d 4143 0000 0000 1400  ....l'aMAC......
  0x0030:  0200 0f00 0000 0000 0000 0000            ............

To prevent confusing the software, force the frame to egress UNTAGGED instead of CONSISTENT. This way, frames can't possibly be received TAGGED by software which would have the special tag field broken.

VLAN Tag Egress Procedure

  For all frames, one of these options set the earliest in this order will apply to the frame:

  - EG_TAG in certain registers for certain frames.
    This will apply to frame with matching MAC DA or EtherType.
  - EG_TAG in the address table.
    This will apply to frame at its incoming port.
  - EG_TAG in the PVC register.
    This will apply to frame at its incoming port.
  - EG_CON and [EG_TAG per port] in the VLAN table.
    This will apply to frame at its outgoing port.
  - EG_TAG in the PCR register.
    This will apply to frame at its outgoing port.

  EG_TAG in certain registers for certain frames:

  PPPoE Discovery_ARP/RARP: PPP_EG_TAG and ARP_EG_TAG in the APC register.
  IGMP_MLD: IGMP_EG_TAG and MLD_EG_TAG in the IMC register.
  BPDU and PAE: BPDU_EG_TAG and PAE_EG_TAG in the BPC register.
  REV_01 and REV_02: R01_EG_TAG and R02_EG_TAG in the RGAC1 register.
  REV_03 and REV_0E: R03_EG_TAG and R0E_EG_TAG in the RGAC2 register.
  REV_10 and REV_20: R10_EG_TAG and R20_EG_TAG in the RGAC3 register.
  REV_21 and REV_UN: R21_EG_TAG and RUN_EG_TAG in the RGAC4 register.

With this change, it can be observed that a bridge interface with stp_state and vlan_filtering enabled will properly block ports now.

Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
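The precedence rule of the VLAN Tag Egress Procedure (the first configured option in the listed order wins) can be modelled roughly as follows. The enums and `resolve_eg_tag()` are hypothetical illustration names; they do not reflect register encodings or any function in the mt7530 driver.

```c
/* Egress tag treatments named in the message; EG_UNSET means "not configured". */
enum eg_tag { EG_UNSET = -1, EG_UNTAGGED, EG_TAGGED, EG_STACKED, EG_CONSISTENT };

/* Sources of an EG_TAG setting, in the order of precedence given above. */
enum eg_source {
	SRC_FRAME_SPECIFIC,	/* per-frame-type registers, e.g. BPDU_EG_TAG */
	SRC_ADDRESS_TABLE,
	SRC_PVC,
	SRC_VLAN_TABLE,
	SRC_PCR,
	SRC_COUNT,
};

/* Return the first configured setting in precedence order, else the fallback. */
static enum eg_tag resolve_eg_tag(const enum eg_tag settings[SRC_COUNT],
				  enum eg_tag fallback)
{
	int i;

	for (i = 0; i < SRC_COUNT; i++) {
		if (settings[i] != EG_UNSET)
			return settings[i];
	}
	return fallback;
}
```

In this model, a BPDU with the frame-specific source set to EG_UNTAGGED egresses untagged even when the VLAN table entry asks for EG_STACKED, which mirrors the intent of forcing UNTAGGED for link-local frames.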
2024-03-21 arm64: bpf: fix 32bit unconditional bswap (Artem Savkov)
When is64 == 1 in emit(A64_REV32(is64, dst, dst), ctx), the generated insn reverses byte order for both the high and low 32-bit words, resulting in an incorrect swap, as indicated by the jit test:

[ 9757.262607] test_bpf: #312 BSWAP 16: 0x0123456789abcdef -> 0xefcd jited:1 8 PASS
[ 9757.264435] test_bpf: #313 BSWAP 32: 0x0123456789abcdef -> 0xefcdab89 jited:1 ret 1460850314 != -271733879 (0x5712ce8a != 0xefcdab89)FAIL (1 times)
[ 9757.266260] test_bpf: #314 BSWAP 64: 0x0123456789abcdef -> 0x67452301 jited:1 8 PASS
[ 9757.268000] test_bpf: #315 BSWAP 64: 0x0123456789abcdef >> 32 -> 0xefcdab89 jited:1 8 PASS
[ 9757.269686] test_bpf: #316 BSWAP 16: 0xfedcba9876543210 -> 0x1032 jited:1 8 PASS
[ 9757.271380] test_bpf: #317 BSWAP 32: 0xfedcba9876543210 -> 0x10325476 jited:1 ret -1460850316 != 271733878 (0xa8ed3174 != 0x10325476)FAIL (1 times)
[ 9757.273022] test_bpf: #318 BSWAP 64: 0xfedcba9876543210 -> 0x98badcfe jited:1 7 PASS
[ 9757.274721] test_bpf: #319 BSWAP 64: 0xfedcba9876543210 >> 32 -> 0x10325476 jited:1 9 PASS

Fix this by forcing 32bit variant of rev32.

Fixes: 1104247f3f979 ("bpf, arm64: Support unconditional bswap")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Tested-by: Puranjay Mohan <puranjay12@gmail.com>
Acked-by: Puranjay Mohan <puranjay12@gmail.com>
Acked-by: Xu Kuohai <xukuohai@huawei.com>
Message-ID: <20240321081809.158803-1-asavkov@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
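The difference between the two instruction variants can be modelled in portable C. This is a software sketch of the semantics only: `rev32_x()` and `rev_w()` are hypothetical helper names, not JIT macros, and the model does not attempt to reproduce the exact failing return values printed in the log.

```c
#include <stdint.h>

/* Reverse the byte order of one 32-bit word. */
static uint32_t bswap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Model of the 64-bit variant: each 32-bit half is byte-reversed in
 * place, so the upper half of the result is generally non-zero.
 */
static uint64_t rev32_x(uint64_t x)
{
	return ((uint64_t)bswap32((uint32_t)(x >> 32)) << 32) |
	       bswap32((uint32_t)x);
}

/*
 * Model of the 32-bit variant: only the low word is byte-reversed and
 * the result is zero-extended, which is what a 32-bit bswap expects.
 */
static uint64_t rev_w(uint64_t x)
{
	return bswap32((uint32_t)x);
}
```

For the input 0x0123456789abcdef, the 32-bit model yields 0xefcdab89, matching the value test #313 expects, while the 64-bit model leaves a swapped copy of the high word in the upper 32 bits.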
2024-03-21 x86/config: Fix warning for 'make ARCH=x86_64 tinyconfig' (Masahiro Yamada)
Kconfig emits a warning for the following command:

  $ make ARCH=x86_64 tinyconfig
    ...
  .config:1380:warning: override: UNWINDER_GUESS changes choice state

When X86_64=y, the unwinder is exclusively selected from the following three options:

  - UNWINDER_ORC
  - UNWINDER_FRAME_POINTER
  - UNWINDER_GUESS

However, arch/x86/configs/tiny.config only specifies the values of the last two. UNWINDER_ORC must be explicitly disabled.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240320154313.612342-1-masahiroy@kernel.org
2024-03-21 drm/i915/mst: enable MST mode for 128b/132b single-stream sideband (Jani Nikula)
If the sink supports 128b/132b and single-stream sideband messaging, enable MST mode. With this, the topology manager will still write DP_MSTM_CTRL, which should be ignored by the sink. In the future, the topology manager should probably only set the sideband messaging related parts of the register. Cc: Arun R Murthy <arun.r.murthy@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/39d753e53cd662c3fd3776b6167bf792219fd950.1710839496.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-03-21 drm/i915/mst: add intel_dp_mst_disconnect() (Jani Nikula)
Abstract the MST mode disconnect to a separate function. Cc: Arun R Murthy <arun.r.murthy@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c39239fb6bef87a89219c8fbe7799f97f91b9042.1710839496.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-03-21 drm/i915/mst: use the MST mode detected previously (Jani Nikula)
Drop the duplicate read of DP_MSTM_CAP DPCD register, and the duplicate logic for choosing MST mode, and store the chosen mode in struct intel_dp. Rename intel_dp_configure_mst() to intel_dp_mst_configure() while at it. v2: Rebase on drm_dp_mst_detect() returning the mode, not bool Cc: Arun R Murthy <arun.r.murthy@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/93a48df9a77e1138bb28e645fae3f9c79b094cc7.1710839496.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-03-21 drm/i915/mst: abstract choosing the MST mode to use (Jani Nikula)
Clarify the conditions for choosing the MST mode to use by adding a new function intel_dp_mst_mode_choose(). This also prepares for being able to extend the MST modes to single-stream sideband messaging. Cc: Arun R Murthy <arun.r.murthy@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/f626144f10b03d4609ff38a29bac013ecf3aca4e.1710839496.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-03-21 drm/i915/mst: improve debug logging of DP MST mode detect (Jani Nikula)
Rename intel_dp_can_mst() to intel_dp_mst_detect(), and move all DP MST detect debug logging there. Debug log the sink's MST capability, including single-stream sideband messaging support, and the decision whether to enable MST mode or not. Do this regardless of whether we're actually enabling MST or not. We need to detect MST in intel_dp_detect_dpcd() before the earlier returns, but try not to change the logic otherwise. v2: - Use "MST", "SST w/ sideband messaging", and "SST" for logging (Ville) - Return MST mode from intel_dp_mst_detect() - Do MST detect before early returns from intel_dp_detect_dpcd() Cc: Arun R Murthy <arun.r.murthy@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/db08536daec0a6062539319d71c10ee1277e3876.1710839496.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-03-21 drm/mst: read sideband messaging cap (Jani Nikula)
Amend drm_dp_read_mst_cap() to return an enum, indicating "SST", "SST with sideband messaging", or "MST". Modify all call sites to take the new return value into account. v2: - Rename enumerators (Ville) Cc: Arun R Murthy <arun.r.murthy@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Karol Herbst <kherbst@redhat.com> Cc: Lyude Paul <lyude@redhat.com> Cc: Danilo Krummrich <dakr@redhat.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Acked-by: Maxime Ripard <mripard@kernel.org> Acked-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/b32a3704934871a67d06420b760e148b76c5ced8.1710839496.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
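A three-way result of this kind can be sketched as below. Everything here is illustrative: the enumerator names, the capability bit positions, and `mst_mode_from_cap()` are assumptions made for the example and should not be read as the DRM helper's actual definitions.

```c
#include <stdint.h>

/* Hypothetical three-state result replacing a plain bool. */
enum mst_mode {
	MODE_SST,			/* no MST, no sideband messaging */
	MODE_SST_SIDEBAND_MSG,		/* single stream with sideband messaging */
	MODE_MST,			/* full multi-stream transport */
};

/* Assumed bit layout of the capability byte, for illustration only. */
#define CAP_MST				(1u << 0)
#define CAP_SINGLE_STREAM_SIDEBAND_MSG	(1u << 1)

/* Map a raw capability byte to a mode; MST takes priority over sideband-only. */
static enum mst_mode mst_mode_from_cap(uint8_t cap)
{
	if (cap & CAP_MST)
		return MODE_MST;
	if (cap & CAP_SINGLE_STREAM_SIDEBAND_MSG)
		return MODE_SST_SIDEBAND_MSG;
	return MODE_SST;
}
```

Callers that previously tested a boolean would compare against the specific enumerator they care about, which is why every call site has to be updated together with the return type.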
2024-03-20 Merge branch 'report-rcu-qs-for-busy-network-kthreads' (Jakub Kicinski)
Yan Zhai says: ==================== Report RCU QS for busy network kthreads This changeset fixes a common problem for busy networking kthreads. These threads, e.g. NAPI threads, typically will do: * polling a batch of packets * if there is more work, call cond_resched() to allow scheduling * continue to poll more packets when rx queue is not empty We observed this being a problem in production, since it can block RCU tasks from making progress under heavy load. Investigation indicates that just calling cond_resched() is insufficient for RCU tasks to reach quiescent states. This also has the side effect of frequently clearing the TIF_NEED_RESCHED flag on voluntary preempt kernels. As a result, schedule() will not be called in these circumstances, even though schedule() in fact provides the required quiescent states. This at least affects NAPI threads, napi_busy_loop, and also cpumap kthread. By reporting RCU QSes in these kthreads periodically before cond_resched, the blocked RCU waiters can correctly progress. Instead of just reporting QS for RCU tasks, this code shares the same concern as noted in the commit d28139c4e967 ("rcu: Apply RCU-bh QSes to RCU-sched and RCU-preempt when safe"). So report a consolidated QS for safety. It is worth noting that, although this problem is reproducible in napi_busy_loop, it only shows up when setting the polling interval to as high as 2ms, which is far larger than the recommended 50us-100us in the documentation. So napi_busy_loop is left untouched. Lastly, this does not affect RT kernels, which do not enter the scheduler through cond_resched(). Without the mentioned side effect, schedule() will be called from time to time and clear the RCU task holdouts. 
V4: https://lore.kernel.org/bpf/cover.1710525524.git.yan@cloudflare.com/ V3: https://lore.kernel.org/lkml/20240314145459.7b3aedf1@kernel.org/t/ V2: https://lore.kernel.org/bpf/ZeFPz4D121TgvCje@debian.debian/ V1: https://lore.kernel.org/lkml/Zd4DXTyCf17lcTfq@debian.debian/#t ==================== Link: https://lore.kernel.org/r/cover.1710877680.git.yan@cloudflare.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
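The shape of the fix (report a quiescent state periodically from the busy loop, before yielding) can be sketched with a user-space model. All names below are hypothetical stubs; the real series uses kernel RCU and scheduler primitives, and the tick counter here merely stands in for a time source.

```c
/* Counters observed by the model; not part of any kernel API. */
struct poll_stats {
	unsigned long batches;		/* packet batches polled */
	unsigned long qs_reports;	/* quiescent states reported */
};

/* Stub standing in for reporting an RCU quiescent state. */
static void report_qs_stub(struct poll_stats *s)
{
	s->qs_reports++;
}

/* Stub standing in for cond_resched(); does nothing in this model. */
static void cond_resched_stub(void)
{
}

/*
 * Busy polling loop: each batch costs one tick. A quiescent state is
 * reported whenever at least `interval` ticks have passed since the
 * last report, and only then does the loop offer to reschedule.
 */
static void busy_poll(unsigned long total, unsigned long interval,
		      struct poll_stats *s)
{
	unsigned long now = 0, last_qs = 0;

	while (s->batches < total) {
		s->batches++;		/* poll one batch of packets */
		now++;

		if (now - last_qs >= interval) {
			report_qs_stub(s);
			last_qs = now;
		}
		cond_resched_stub();
	}
}
```

Rate-limiting the report keeps the per-batch overhead low while still bounding how long RCU waiters can be held up by a thread that never leaves its loop.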