path: root/include
Age | Commit message | Author
2025-03-08 | ASoC: SOF: ipc4: Add support for Intel HW managed mic privacy messaging | Peter Ujfalusi
ACE3 (Panther Lake) introduced support for microphone privacy feature which can - in hardware - mute incoming audio data based on a state of a physical switch. The change in the privacy state is delivered through interface IP blocks and can only be handled by the link owner. In Intel platforms Soundwire is for example host owned, so the interrupt can only be handled by the host. Since the input stream is going to be muted by hardware, the host needs to send a message to firmware about the change in privacy so it can execute a fade out/in to enhance user experience. The support for microphone privacy can be queried from the HW_CONFIG data under the INTEL_MIC_PRIVACY_CAP tuple. This is Intel specific data, the core will pass it to platform code if the intel_configure_mic_privacy() callback is provided. Platform code can call sof_ipc4_mic_privacy_state_change() to send the IPC message to the firmware on state change. Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://patch.msgid.link/20250307112816.1495-6-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-08 | PCI: endpoint: Remove unused devm_pci_epc_destroy() | Zijun Hu
The static function devm_pci_epc_match() is only invoked within the devm_pci_epc_destroy(). However, since it was initially introduced, this new API has had no callers. Thus, remove both the unused API and the static function. Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20250217-remove_api-v2-1-b169c9117045@quicinc.com Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-08 | PCI: endpoint: Add pci_epc_bar_size_to_rebar_cap() | Niklas Cassel
Add a helper function to convert a size to the representation used by the Resizable BAR Capability Register. Signed-off-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250131182949.465530-11-cassel@kernel.org [mani: squashed the change that added PCIe spec reference to comments from https://lore.kernel.org/linux-pci/20250219171454.2903059-2-cassel@kernel.org] Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
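The conversion described above can be sketched in userspace C. This is a minimal illustration, not the kernel's implementation: the name sketch_size_to_rebar_cap() and the SKETCH_ macros are hypothetical, and the bit layout (bit 4 stands for 1 MB, bit 31 for 128 TB) is an assumption based on the Resizable BAR Capability register described in PCIe r6.0, sec 7.8.6.2.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_SZ_1M   (UINT64_C(1) << 20)
#define SKETCH_SZ_128T (UINT64_C(1) << 47)

/* Hypothetical userspace stand-in; not the kernel helper itself. */
static int sketch_size_to_rebar_cap(uint64_t size, uint32_t *cap)
{
	unsigned int log2 = 0;

	/* Only power-of-two sizes from 1 MB to 128 TB are representable. */
	if (size < SKETCH_SZ_1M || size > SKETCH_SZ_128T || (size & (size - 1)))
		return -EINVAL;

	/* Integer log2 of a power of two. */
	while ((size >> log2) != 1)
		log2++;

	/* Assumed layout: bit 4 means 1 MB (2^20); each higher bit doubles it. */
	*cap = UINT32_C(1) << (log2 - 20 + 4);
	return 0;
}
```

Under these assumptions a 1 MB request sets bit 4 and a 128 TB request sets bit 31; anything smaller, larger, or not a power of two is rejected.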
2025-03-08 | PCI: endpoint: Allow EPF drivers to configure the size of Resizable BARs | Niklas Cassel
A resizable BAR is different from a normal BAR in a few ways: - The minimum size of a resizable BAR is 1 MB. - Each BAR that is resizable has a Capability and Control register in the Resizable BAR Capability structure. These registers contain the supported sizes and the currently selected size of a resizable BAR. The supported sizes is a bitmap of the supported sizes. The selected size is a single value that is equal to one of the supported sizes. A resizable BAR thus has to be configured differently than a BAR_PROGRAMMABLE BAR, which usually sets the BAR size/mask in a vendor specific way. The PCI endpoint framework currently does not support resizable BARs. Add a BAR type BAR_RESIZABLE, so that an EPC driver can support resizable BARs properly. Note that the pci_epc_set_bar() API takes a struct pci_epf_bar which tells the EPC driver how it wants to configure the BAR. struct pci_epf_bar only has a single size struct member. This means that an EPC driver will only be able to set a single supported size. This is perfectly fine, as we do not need the complexity of allowing a host to change the size of the BAR. If someone ever wants to support resizing a resizable BAR, the pci_epc_set_bar() API can be extended in the future. With these changes, we allow an EPF driver to configure the size of Resizable BARs, rather than forcing them to a 1 MB size. Signed-off-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250131182949.465530-10-cassel@kernel.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2025-03-08 | vdso: Rework struct vdso_time_data and introduce struct vdso_clock | Anna-Maria Behnsen
To support multiple PTP clocks, the VDSO data structure needs to be reworked. All clock specific data will end up in struct vdso_clock and in struct vdso_time_data there will be an array of VDSO clocks. Now that all preparatory changes are in place: Split the clock related struct members into a separate struct vdso_clock. Make sure all users are aware, that vdso_time_data is no longer initialized as an array and vdso_clock is now the array inside vdso_data. Remove the vdso_clock define, which mapped it to vdso_time_data for the transition. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-19-c1b5c69a166f@linutronix.de
2025-03-08 | vdso: Move architecture related data before basetime data | Anna-Maria Behnsen
Architecture related vdso data is required in the fast path when reading CLOCK_MONOTONIC or CLOCK_REALTIME. At the moment, this information is located at the end of the vdso_time_data structure, which is a suboptimal cache layout. Move the architecture specific VDSO data right before the basetime information, which is always required. This change does not have an impact on architectures with CONFIG_ARCH_HAS_VDSO_DATA=n. Architectures, which have it enabled, gain a better cache layout. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-18-c1b5c69a166f@linutronix.de
2025-03-08 | vdso/helpers: Prepare introduction of struct vdso_clock | Anna-Maria Behnsen
To support multiple PTP clocks, the VDSO data structure needs to be reworked. All clock specific data will end up in struct vdso_clock and in struct vdso_time_data there will be an array of VDSO clocks. For now, vdso_clock is simply a define which maps vdso_clock to vdso_time_data. Prepare all functions which need the pointer to the vdso_clock array to work well after the structures get reworked. Replace the struct vdso_time_data pointer with a struct vdso_clock pointer where applicable. No functional change. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-5-c1b5c69a166f@linutronix.de
2025-03-08 | vdso/datapage: Define vdso_clock to prepare for multiple PTP clocks | Anna-Maria Behnsen
Multiple PTP clocks, which are independent of timekeeping, are required for systems, which utilize PTP for synchronizing e.g. automation systems independent of clock TAI. PTP clocks are slow to access, but applications require fast access to the relevant time similar to the regular timekeeping relevant clocks. To prepare for that the VDSO data representation must be reworked. For transition to the new structure of the vdso, add a define which maps vdso_clock to vdso_data. This will be removed when all users are updated step by step. No functional change. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-4-c1b5c69a166f@linutronix.de
2025-03-08 | vdso: Make vdso_time_data cacheline aligned | Anna-Maria Behnsen
vdso_time_data is not cacheline aligned at the moment. When instantiating an array, the start of the second array member is not cache line aligned. This increases the number of cache lines which need to be read when handling e.g. CLOCK_MONOTONIC_RAW, because the data spans an extra cache line if the previous data does not end at a cache line boundary. Therefore make struct vdso_time_data cacheline aligned. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-3-c1b5c69a166f@linutronix.de
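The effect of the alignment can be illustrated with standard C11 rather than the kernel's ____cacheline_aligned macro. In this sketch the structure, its fields, and the 64-byte line size are all assumptions chosen for illustration; the real line size is architecture dependent.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CACHELINE 64 /* assumed line size; the real value is per-architecture */

/* Hypothetical stand-in for a per-clock data block. */
struct sketch_time_data {
	_Alignas(SKETCH_CACHELINE) uint64_t cycle_last;
	uint64_t mult;
	uint32_t shift;
};

/* Aligning the type pads its size to a multiple of the line size ... */
static_assert(sizeof(struct sketch_time_data) % SKETCH_CACHELINE == 0,
	      "size must be a multiple of the cache line size");

/* ... so every array element, not only the first, starts on a line boundary. */
static int sketch_second_element_aligned(void)
{
	static struct sketch_time_data arr[2];

	return ((uintptr_t)&arr[1] % SKETCH_CACHELINE) == 0;
}
```

Without the alignment specifier the 20 bytes of payload would pad to 24, and the second element would start in the middle of the first element's cache line.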
2025-03-08 | vdso: Introduce vdso/cache.h | Thomas Weißschuh
The vDSO implementation can only include headers from the vdso/ namespace. To enable the usage of ____cacheline_aligned from the vDSO, move it and its dependencies into a new header vdso/cache.h. Keep compatibility by including vdso/cache.h from linux/cache.h. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-1-c1b5c69a166f@linutronix.de
2025-03-08 | vsprintf: add simple_strntoul | David Disseldorp
cpio extraction currently does a memcpy to ensure that the archive hex fields are null terminated for simple_strtoul(). simple_strntoul() will allow us to avoid the memcpy. Signed-off-by: David Disseldorp <ddiss@suse.de> Link: https://lore.kernel.org/r/20250304061020.9815-4-ddiss@suse.de Signed-off-by: Christian Brauner <brauner@kernel.org>
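Why a length-limited parser removes the memcpy can be shown with a small userspace sketch. sketch_hex_ntoul() below is a hypothetical, hex-only stand-in and does not reproduce the signature or behavior of the kernel's simple_strntoul(). It parses fixed-width fields that sit back to back in a buffer with no NUL terminator between them.

```c
#include <stddef.h>

/*
 * Hypothetical helper: parse at most max_chars hex digits from s,
 * stopping early at the first non-hex character. The input does not
 * need to be NUL terminated. If consumed is non-NULL, it receives the
 * number of characters that were parsed.
 */
static unsigned long sketch_hex_ntoul(const char *s, size_t max_chars,
				      size_t *consumed)
{
	unsigned long val = 0;
	size_t i;

	for (i = 0; i < max_chars; i++) {
		unsigned char c = (unsigned char)s[i];
		unsigned int digit;

		if (c >= '0' && c <= '9')
			digit = c - '0';
		else if (c >= 'a' && c <= 'f')
			digit = c - 'a' + 10;
		else if (c >= 'A' && c <= 'F')
			digit = c - 'A' + 10;
		else
			break;

		val = (val << 4) | digit;
	}

	if (consumed)
		*consumed = i;
	return val;
}
```

With adjacent 8-character fields, each field is parsed in place by passing a pointer to its start and a limit of 8, so no temporary terminated copy is needed.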
2025-03-08 | crypto: acomp - Remove acomp request flags | Herbert Xu
The acomp request flags field duplicates the base request flags and is confusing. Remove it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-08 | crypto: lzo - Fix compression buffer overrun | Herbert Xu
Unlike the decompression code, the compression code in LZO never checked for output overruns. It instead assumes that the caller always provides enough buffer space, disregarding the buffer length provided by the caller. Add a safe compression interface that checks for the end of buffer before each write. Use the safe interface in crypto/lzo. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
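The idea of checking for the end of the buffer before each write can be sketched as a bounded output cursor. This is not the LZO code: struct sketch_out, sketch_put(), and the choice of -ENOSPC as the error value are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical output cursor: pos advances toward end as data is written. */
struct sketch_out {
	unsigned char *pos;
	unsigned char *end;
};

/* Append len bytes, refusing the write if it would pass the end of the buffer. */
static int sketch_put(struct sketch_out *out, const void *src, size_t len)
{
	if ((size_t)(out->end - out->pos) < len)
		return -ENOSPC; /* assumed error code for this sketch */

	memcpy(out->pos, src, len);
	out->pos += len;
	return 0;
}
```

A compressor built on such a helper fails cleanly when the caller's buffer is too small, instead of trusting the caller to have provided the worst-case size.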
2025-03-08 | crypto: api - Move struct crypto_type into internal.h | Herbert Xu
Move the definition of struct crypto_type into internal.h as it is only used by API implementors and not algorithm implementors. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-03-07 | capability: Remove unused has_capability | Dr. David Alan Gilbert
The vanilla has_capability() function has been unused since 2018's commit dcb569cf6ac9 ("Smack: ptrace capability use fixes") Remove it. Fixup a comment in security/commoncap.c that referenced it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Serge Hallyn <sergeh@kernel.org>
2025-03-07 | ubsan/overflow: Rework integer overflow sanitizer option to turn on everything | Kees Cook
Since we're going to approach integer overflow mitigation a type at a time, we need to enable all of the associated sanitizers, and then opt into types one at a time. Rename the existing "signed wrap" sanitizer to just the entire topic area: "integer wrap". Enable the implicit integer truncation sanitizers, with required callbacks and tests. Notably, this requires features (currently) only available in Clang, so we can depend on the cc-option tests to determine availability instead of doing version tests. Link: https://lore.kernel.org/r/20250307041914.937329-1-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-03-07 | netpoll: Optimize skb refilling on critical path | Breno Leitao
netpoll tries to refill the skb queue on every packet send, regardless of whether packets are being consumed from the pool. This was particularly problematic while being called from printk(), where the operation would be done while holding the console lock. Introduce a more intelligent approach to skb queue management. Instead of constantly attempting to refill the queue, the system now defers refilling to a work queue and only triggers the workqueue when a buffer is actually dequeued. This change significantly reduces operations with the lock held. Add a work_struct to the netpoll structure for asynchronous refilling, updating find_skb() to schedule refill work only when necessary (skb is dequeued). These changes have demonstrated a 15% reduction in time spent during netpoll_send_msg operations, especially when no SKBs are consumed from the pool. When SKBs are being dequeued, the improvement is even better, around 70%, mainly because refilling the SKB pool is now happening outside of the critical path (with console_owner lock held). Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250304-netpoll_refill_v2-v1-1-06e2916a4642@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 | tcp: ulp: diag: more info without CAP_NET_ADMIN | Matthieu Baerts (NGI0)
When introduced in commit 61723b393292 ("tcp: ulp: add functions to dump ulp-specific information"), the whole ULP diag info has been exported only if the requester had CAP_NET_ADMIN. It looks like not everything is sensitive, and some info can be exported to all users in order to ease the debugging from the userspace side without requiring additional capabilities. Each layer should then decide what can be exposed to everybody. The 'net_admin' boolean is then passed to the different layers. On kTLS side, it looks like there is nothing sensitive there: version, cipher type, tx/rx user config type, plus some flags. So, only some metadata about the configuration, no cryptographic info like keys, etc. Then, everything can be exported to all users. On MPTCP side, that's different. The MPTCP-related sequence numbers per subflow should certainly not be exposed to everybody. For example, the DSS mapping and ssn_offset would give all users on the system access to narrow ranges of values for the subflow TCP sequence numbers and MPTCP-level DSNs, and then ease packet injection. The TCP diag interface doesn't expose the TCP sequence numbers for TCP sockets, so best to do the same here. The rest -- token, IDs, flags -- can be exported to everybody. Acked-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250306-net-next-tcp-ulp-diag-net-admin-v1-2-06afdd860fc9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 | net: phylink: Remove unused phylink_init_eee | Dr. David Alan Gilbert
phylink_init_eee() is currently unused. It was last added in 2019 by commit 86e58135bc4a ("net: phylink: add phylink_init_eee() helper") but it didn't actually wire a use up. It had previously been removed in 2017 by commit 939eae25d9a5 ("phylink: remove phylink_init_eee()"). Remove it again. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20250306184534.246152-1-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-08 | power: supply: core: get rid of of_node | Sebastian Reichel
This removes .of_node from 'struct power_supply', since there is already a copy in .dev.of_node and there is no need to have two copies. Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20250225-psy-core-convert-to-fwnode-v1-1-d5e4369936bb@collabora.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2025-03-08 | power: supply: Remove unused set_charged method | Dr. David Alan Gilbert
The previous patches in this series removed the only caller and only setter of this method. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20250307230225.128775-4-linux@treblig.org Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2025-03-08 | power: supply: core: Remove unused power_supply_set_battery_charged | Dr. David Alan Gilbert
power_supply_set_battery_charged() has been unused since 2019's commit 0f884f8a090e ("ARM: pxa: remove raumfeld board files and defconfig") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20250307230225.128775-2-linux@treblig.org Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2025-03-08 | counter: microchip-tcb-capture: Add capture extensions for registers RA/RB | Bence Csókás
TCB hardware is capable of capturing the timer value to registers RA and RB. Add these registers as capture extensions. Signed-off-by: Bence Csókás <csokas.bence@prolan.hu> Link: https://lore.kernel.org/r/20250306134441.582819-3-csokas.bence@prolan.hu Signed-off-by: William Breathitt Gray <wbg@kernel.org>
2025-03-08 | counter: microchip-tcb-capture: Add IRQ handling | Bence Csókás
Add interrupt servicing to allow userspace to wait for the following: * Change-of-state caused by external trigger * Capture of timer value into RA/RB * Compare to RC register * Overflow Signed-off-by: Bence Csókás <csokas.bence@prolan.hu> Link: https://lore.kernel.org/r/20250306134441.582819-2-csokas.bence@prolan.hu Signed-off-by: William Breathitt Gray <wbg@kernel.org>
2025-03-07 | Merge tag 'acpi-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm | Linus Torvalds
Pull ACPI fix from Rafael Wysocki: "Restore the previous behavior of the ACPI platform_profile sysfs interface that has been changed recently in a way incompatible with the existing user space (Mario Limonciello)" * tag 'acpi-6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: platform/x86/amd: pmf: Add balanced-performance to hidden choices platform/x86/amd: pmf: Add 'quiet' to hidden choices ACPI: platform_profile: Add support for hidden choices
2025-03-07 | Merge tag 'block-6.14-20250306' of git://git.kernel.dk/linux | Linus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request via Keith: - TCP use after free fix on polling (Sagi) - Controller memory buffer cleanup fixes (Icenowy) - Free leaking requests on bad user passthrough commands (Keith) - TCP error message fix (Maurizio) - TCP corruption fix on partial PDU (Maurizio) - TCP memory ordering fix for weakly ordered archs (Meir) - Type coercion fix on message error for TCP (Dan) - Name the RQF flags enum, fixing issues with anon enums and BPF import of it - ublk parameter setting fix - GPT partition 7-bit conversion fix * tag 'block-6.14-20250306' of git://git.kernel.dk/linux: block: Name the RQF flags enum nvme-tcp: fix signedness bug in nvme_tcp_init_connection() block: fix conversion of GPT partition name to 7-bit ublk: set_params: properly check if parameters can be applied nvmet-tcp: Fix a possible sporadic response drops in weakly ordered arch nvme-tcp: fix potential memory corruption in nvme_tcp_recv_pdu() nvme-tcp: Fix a C2HTermReq error message nvmet: remove old function prototype nvme-ioctl: fix leaked requests on mapping error nvme-pci: skip CMB blocks incompatible with PCI P2P DMA nvme-pci: clean up CMBMSC when registering CMB fails nvme-tcp: fix possible UAF in nvme_tcp_poll
2025-03-07 | drm/amdkfd: Add support for more per-process flag | Harish Kasiviswanathan
Add support for more per-process flags starting with option to configure MFMA precision for gfx 9.5 v2: Change flag name to KFD_PROC_FLAG_MFMA_HIGH_PRECISION Remove unused else condition v3: Bump the KFD API version v4: Missed SH_MEM_CONFIG__PRECISION_MODE__SHIFT define. Added it. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07 | elf: add remaining SHF_ flag macros | Timur Tabi
Add the remaining SHF_ flags, as listed in the "Executable and Linkable Format" Wikipedia page and the System V Application Binary Interface[1]. This allows drivers to load and parse ELF images that use some of those flags. In particular, an upcoming change to the Nouveau GPU driver will use some of the flags. Link: https://refspecs.linuxfoundation.org/elf/gabi4+/ch4.sheader.html#sh_flags [1] Signed-off-by: Timur Tabi <ttabi@nvidia.com> Link: https://lore.kernel.org/r/20250307171417.267488-1-ttabi@nvidia.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-03-07 | Revert "Bluetooth: hci_core: Fix sleeping function called from invalid context" | Luiz Augusto von Dentz
This reverts commit 4d94f05558271654670d18c26c912da0c1c15549 which has problems (see [1]) and is no longer needed since 581dd2dc168f ("Bluetooth: hci_event: Fix using rcu_read_(un)lock while iterating") has reworked the code where the original bug has been found. [1] Link: https://lore.kernel.org/linux-bluetooth/877c55ci1r.wl-tiwai@suse.de/T/#t Fixes: 4d94f0555827 ("Bluetooth: hci_core: Fix sleeping function called from invalid context") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-03-07 | PCI: Increase Resizable BAR support from 512 GB to 128 TB | Zhiyuan Dai
Per PCIe r6.0, sec 7.8.6.2, devices can advertise Resizable BAR sizes up to 128 TB in the Resizable BAR Capability register. Larger sizes can be advertised via the Capability register, but that requires an API change. Update pci_rebar_get_possible_sizes() and pbus_size_mem() to increase the sizes we currently support from 512 GB to 128 TB. Link: https://lore.kernel.org/r/20250307053535.44918-1-daizhiyuan@phytium.com.cn Signed-off-by: Zhiyuan Dai <daizhiyuan@phytium.com.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
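The jump from 512 GB to 128 TB corresponds to how many bits of the supported-sizes bitmap are interpreted. The sketch below decodes such a bitmap in userspace C. The function names are hypothetical, and the layout (bit 4 is 1 MB, each higher bit doubles the size, bit 31 is 128 TB) is an assumption based on the Resizable BAR Capability register in PCIe r6.0, sec 7.8.6.2; it is not the kernel's pci_rebar_get_possible_sizes().

```c
#include <stdint.h>

/* Hypothetical: map a capability bit position (4..31) to a size in bytes. */
static uint64_t sketch_rebar_bit_to_bytes(unsigned int bit)
{
	if (bit < 4 || bit > 31)
		return 0;

	/* Assumed layout: bit 4 is 2^20 bytes (1 MB), bit 31 is 2^47 (128 TB). */
	return UINT64_C(1) << (bit - 4 + 20);
}

/* Hypothetical: return the largest size advertised in a capability bitmap. */
static uint64_t sketch_rebar_largest_size(uint32_t cap)
{
	int bit;

	for (bit = 31; bit >= 4; bit--) {
		if (cap & (UINT32_C(1) << bit))
			return sketch_rebar_bit_to_bytes((unsigned int)bit);
	}
	return 0;
}
```

Under this layout 512 GB is bit 23, so a decoder that stops there ignores bits 24 through 31, which are the sizes from 1 TB to 128 TB.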
2025-03-07 | io_uring/rw: defer reg buf vec import | Pavel Begunkov
Import registered buffers for vectored reads and writes later at issue time as we now do for other fixed ops. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/e8491c976e4ab83a4e3dc428e9fe7555e59583b8.1741362889.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07 | io_uring/rw: implement vectored registered rw | Pavel Begunkov
Implement registered buffer vectored reads and writes with new opcodes IORING_OP_READV_FIXED and IORING_OP_WRITEV_FIXED. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d7c89eb481e870f598edc91cc66ff4d1e4ae3788.1741362889.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07 | io_uring: add infra for importing vectored reg buffers | Pavel Begunkov
Add io_import_reg_vec(), which will be responsible for importing vectored registered buffers. The function might reallocate the vector, but it'd try to do the conversion in place first, which is why it's required of the user to pad the iovec to the right border of the cache. Overlapping also depends on struct iovec being larger than bvec, which is not the case on e.g. 32 bit architectures. Don't try to complicate this case and make sure vectors never overlap, it'll be improved later. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/60bd246b1249476a6996407c1dbc38ef6febad14.1741362889.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07 | io_uring: introduce struct iou_vec | Pavel Begunkov
I need a convenient way to pass around and work with an iovec+size pair. Put them into a structure and make use of it in rw.c. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d39fadafc9e9047b0a292e5be6db3cf2f48bb1f7.1741362889.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07 | Merge branch 'for-6.15/io_uring-epoll-wait' into for-6.15/io_uring-reg-vec | Jens Axboe
* for-6.15/io_uring-epoll-wait: io_uring/epoll: add support for IORING_OP_EPOLL_WAIT io_uring/epoll: remove CONFIG_EPOLL guards eventpoll: add epoll_sendevents() helper eventpoll: abstract out ep_try_send_events() helper eventpoll: abstract out parameter sanity checking
2025-03-07 | Merge branch 'for-6.15/io_uring-rx-zc' into for-6.15/io_uring-reg-vec | Jens Axboe
* for-6.15/io_uring-rx-zc: (80 commits) io_uring/zcrx: add selftest case for recvzc with read limit io_uring/zcrx: add a read limit to recvzc requests io_uring: add missing IORING_MAP_OFF_ZCRX_REGION in io_uring_mmap io_uring: Rename KConfig to Kconfig io_uring/zcrx: fix leaks on failed registration io_uring/zcrx: recheck ifq on shutdown io_uring/zcrx: add selftest net: add documentation for io_uring zcrx io_uring/zcrx: add copy fallback io_uring/zcrx: throttle receive requests io_uring/zcrx: set pp memory provider for an rx queue io_uring/zcrx: add io_recvzc request io_uring/zcrx: dma-map area for the device io_uring/zcrx: implement zerocopy receive pp memory provider io_uring/zcrx: grab a net device io_uring/zcrx: add io_zcrx_area io_uring/zcrx: add interface queue and refill queue net: add helpers for setting a memory provider on an rx queue net: page_pool: add memory provider helpers net: prepare for non devmem TCP memory providers ...
2025-03-07 | Merge branch 'for-6.15/io_uring' into for-6.15/io_uring-reg-vec | Jens Axboe
* for-6.15/io_uring: (80 commits) io_uring: introduce io_cache_free() helper io_uring/rsrc: skip NULL file/buffer checks in io_free_rsrc_node() io_uring/rsrc: avoid NULL node check on io_sqe_buffer_register() failure io_uring/rsrc: call io_free_node() on io_sqe_buffer_register() failure io_uring/rsrc: free io_rsrc_node using kfree() io_uring/rsrc: split out io_free_node() helper io_uring/rsrc: include io_uring_types.h in rsrc.h ublk: don't cast registered buffer index to int io_uring/nop: use io_find_buf_node() io_uring/rsrc: declare io_find_buf_node() in header file io_uring/ublk: report error when unregister operation fails io_uring: convert cmd_to_io_kiocb() macro to function io_uring/uring_cmd: specify io_uring_cmd_import_fixed() pointer type io_uring/rsrc: use rq_data_dir() to compute bvec dir selftests: ublk: add ublk zero copy test selftests: ublk: add file backed ublk selftests: ublk: add kernel selftests for ublk io_uring: cache nodes and mapped buffers ublk: zc register/unregister bvec io_uring: add support for kernel registered bvecs ...
2025-03-07 | PM: EM: Address RCU-related sparse warnings | Rafael J. Wysocki
The usage of __rcu in the Energy Model code is quite inconsistent which causes the following sparse warnings to trigger:
kernel/power/energy_model.c:169:15: warning: incorrect type in assignment (different address spaces)
kernel/power/energy_model.c:169:15: expected struct em_perf_table [noderef] __rcu *table
kernel/power/energy_model.c:169:15: got struct em_perf_table *
kernel/power/energy_model.c:171:9: warning: incorrect type in argument 1 (different address spaces)
kernel/power/energy_model.c:171:9: expected struct callback_head *head
kernel/power/energy_model.c:171:9: got struct callback_head [noderef] __rcu *
kernel/power/energy_model.c:171:9: warning: cast removes address space '__rcu' of expression
kernel/power/energy_model.c:182:19: warning: incorrect type in argument 1 (different address spaces)
kernel/power/energy_model.c:182:19: expected struct kref *kref
kernel/power/energy_model.c:182:19: got struct kref [noderef] __rcu *
kernel/power/energy_model.c:200:15: warning: incorrect type in assignment (different address spaces)
kernel/power/energy_model.c:200:15: expected struct em_perf_table [noderef] __rcu *table
kernel/power/energy_model.c:200:15: got void *[assigned] _res
kernel/power/energy_model.c:204:20: warning: incorrect type in argument 1 (different address spaces)
kernel/power/energy_model.c:204:20: expected struct kref *kref
kernel/power/energy_model.c:204:20: got struct kref [noderef] __rcu *
kernel/power/energy_model.c:320:19: warning: incorrect type in argument 1 (different address spaces)
kernel/power/energy_model.c:320:19: expected struct kref *kref
kernel/power/energy_model.c:320:19: got struct kref [noderef] __rcu *
kernel/power/energy_model.c:325:45: warning: incorrect type in argument 2 (different address spaces)
kernel/power/energy_model.c:325:45: expected struct em_perf_state *table
kernel/power/energy_model.c:325:45: got struct em_perf_state [noderef] __rcu *
kernel/power/energy_model.c:425:45: warning: incorrect type in argument 3 (different address spaces)
kernel/power/energy_model.c:425:45: expected struct em_perf_state *table
kernel/power/energy_model.c:425:45: got struct em_perf_state [noderef] __rcu *
kernel/power/energy_model.c:442:15: warning: incorrect type in argument 1 (different address spaces)
kernel/power/energy_model.c:442:15: expected void const *objp
kernel/power/energy_model.c:442:15: got struct em_perf_table [noderef] __rcu *[assigned] em_table
kernel/power/energy_model.c:626:55: warning: incorrect type in argument 2 (different address spaces)
kernel/power/energy_model.c:626:55: expected struct em_perf_state *table
kernel/power/energy_model.c:626:55: got struct em_perf_state [noderef] __rcu *
kernel/power/energy_model.c:681:16: warning: incorrect type in assignment (different address spaces)
kernel/power/energy_model.c:681:16: expected struct em_perf_state *new_ps
kernel/power/energy_model.c:681:16: got struct em_perf_state [noderef] __rcu *
kernel/power/energy_model.c:699:37: warning: incorrect type in argument 2 (different address spaces)
kernel/power/energy_model.c:699:37: expected struct em_perf_state *table
kernel/power/energy_model.c:699:37: got struct em_perf_state [noderef] __rcu *
kernel/power/energy_model.c:733:38: warning: incorrect type in argument 3 (different address spaces)
kernel/power/energy_model.c:733:38: expected struct em_perf_state *table
kernel/power/energy_model.c:733:38: got struct em_perf_state [noderef] __rcu *
kernel/power/energy_model.c:855:53: warning: dereference of noderef expression
kernel/power/energy_model.c:864:32: warning: dereference of noderef expression
This is because the __rcu annotation for sparse is only applicable to pointers that need rcu_dereference() or equivalent for protection, which basically means pointers assigned with rcu_assign_pointer(). Make all of the above sparse warnings go away by cleaning up the usage of __rcu and using rcu_dereference_protected() where applicable.
Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/5885405.DvuYhMxLoT@rjwysocki.net
2025-03-07 | bdev: add back PAGE_SIZE block size validation for sb_set_blocksize() | Luis Chamberlain
The commit titled "block/bdev: lift block size restrictions to 64k" lifted the block layer's max supported block size to 64k inside the helper blk_validate_block_size() now that we support large folios. However in lifting the block size we also removed the silly use cases many filesystems have to use sb_set_blocksize() to *verify* that the block size <= PAGE_SIZE. The call to sb_set_blocksize() was used to check the block size <= PAGE_SIZE since historically we've always supported userspace to create for example 64k block size filesystems even on 4k page size systems, but what we didn't allow was mounting them. Older filesystems have been using the check with sb_set_blocksize() for years. While we could argue that such checks should be filesystem specific, there are many more users of sb_set_blocksize() than LBS-enabled filesystems upstream, so just do the easier thing and bring back the PAGE_SIZE check for sb_set_blocksize() users and only skip it for LBS-enabled filesystems. This will ensure that tests such as generic/466 when run in a loop against say, ext4, won't try to actually mount a filesystem with a block size larger than your filesystem supports given your PAGE_SIZE and in the worst case crash. Cc: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20250307020403.3068567-1-mcgrof@kernel.org Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
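The two-level check described above can be sketched as follows. Everything here is illustrative: the function name, the fs_supports_lbs parameter, the 4k page size, and the 512-byte lower bound are assumptions; only the 64k upper limit and the "skip the PAGE_SIZE check for LBS enabled filesystems" rule come from the commit message.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SIZE      4096u   /* assumed 4k page size */
#define SKETCH_MIN_BLOCK_SIZE 512u    /* assumed lower bound (one sector) */
#define SKETCH_MAX_BLOCK_SIZE 65536u  /* 64k block layer limit from the text */

/* Hypothetical validation mirroring the rule described in the commit message. */
static bool sketch_blocksize_ok(unsigned int size, bool fs_supports_lbs)
{
	/* Block layer rule: a power of two within the supported range. */
	if (size < SKETCH_MIN_BLOCK_SIZE || size > SKETCH_MAX_BLOCK_SIZE ||
	    (size & (size - 1)))
		return false;

	/* Filesystems without large block size support stay at or below PAGE_SIZE. */
	if (!fs_supports_lbs && size > SKETCH_PAGE_SIZE)
		return false;

	return true;
}
```

With these assumptions, a 64k block size is rejected for a filesystem without LBS support on a 4k page system but accepted for one that has it.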
2025-03-06 | fs/pipe: add simpler helpers for common cases | Linus Torvalds
The fix to atomically read the pipe head and tail state when not holding the pipe mutex has caused a number of headaches due to the size change of the involved types. It turns out that we don't have _that_ many places that access these fields directly and were affected, but we have more than we strictly should have, because our low-level helper functions have been designed to have intimate knowledge of how the pipes work. And as a result, that random noise of direct 'pipe->head' and 'pipe->tail' accesses makes it harder to pinpoint any actual potential problem spots remaining. For example, we didn't have a "is the pipe full" helper function, but instead had a "given these pipe buffer indexes and this pipe size, is the pipe full". That's because some low-level pipe code does actually want that much more complicated interface. But most other places literally just want a "is the pipe full" helper, and not having it meant that those places ended up being unnecessarily much too aware of this all. It would have been much better if only the very core pipe code that cared had been the one aware of this all. So let's fix it - better late than never. This just introduces the trivial wrappers for "is this pipe full or empty" and to get how many pipe buffers are used, so that instead of writing if (pipe_full(pipe->head, pipe->tail, pipe->max_usage)) the places that literally just want to know if a pipe is full can just say if (pipe_is_full(pipe)) instead. The existing trivial cases were converted with a 'sed' script. This cuts down on the places that access pipe->head and pipe->tail directly outside of the pipe code (and core splice code) quite a lot. The splice code in particular still revels in doing the direct low-level accesses, and the fuse fuse_dev_splice_write() code also seems a bit unnecessarily eager to go very low-level, but it's at least a bit better than it used to be. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
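The wrapper idea from this commit can be sketched in plain C as follows. The struct below is a simplified stand-in for the kernel's pipe structure, and all `sketch_` names are hypothetical; the index types in the kernel differ from the fixed `uint32_t` assumed here. The point is only that the three-argument helper stays available for low-level code while most callers use the one-argument wrappers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the pipe state; not the kernel's layout. */
struct sketch_pipe {
	uint32_t head;      /* index of the next slot to fill */
	uint32_t tail;      /* index of the next slot to drain */
	uint32_t max_usage; /* number of slots that may be in use */
};

/* Low-level form: callers supply the indexes and the limit themselves. */
static uint32_t sketch_pipe_occupancy(uint32_t head, uint32_t tail)
{
	return head - tail; /* unsigned arithmetic stays correct across wraparound */
}

static bool sketch_pipe_full(uint32_t head, uint32_t tail, uint32_t limit)
{
	return sketch_pipe_occupancy(head, tail) >= limit;
}

/* Trivial wrappers: callers only need the pipe itself. */
static bool sketch_pipe_is_full(const struct sketch_pipe *pipe)
{
	return sketch_pipe_full(pipe->head, pipe->tail, pipe->max_usage);
}

static bool sketch_pipe_is_empty(const struct sketch_pipe *pipe)
{
	return pipe->head == pipe->tail;
}

static uint32_t sketch_pipe_buf_usage(const struct sketch_pipe *pipe)
{
	return sketch_pipe_occupancy(pipe->head, pipe->tail);
}
```

Keeping the index arithmetic in one place means a later change to the index type only has to touch the low-level helpers, which is the maintenance benefit the commit message argues for.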
2025-03-06 | Merge tag 'nf-25-03-06' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf | Jakub Kicinski
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Fix racy non-atomic read-then-increment operation with PREEMPT_RT in nft_ct, from Sebastian Andrzej Siewior.
2) GC is not skipped when jiffies wrap around in nf_conncount, from Nicklas Bo Jensen.
3) flush_work() on nf_tables_destroy_work waits for the last queued instance, this could be an instance that is different from the one that we must wait for, then make destruction work queue pernet.

* tag 'nf-25-03-06' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: make destruction work queue pernet
  netfilter: nf_conncount: garbage collection is not skipped when jiffies wrap around
  netfilter: nft_ct: Use __refcount_inc() for per-CPU nft_ct_pcpu_template.
====================

Link: https://patch.msgid.link/20250306153446.46712-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-06 | net/mlx5: Relocate function declarations from port.h to mlx5_core.h | Shahar Shitrit
The port header is a general file under include, yet it contains declarations for functions that are either not exported or exported but not used outside the mlx5_core driver. To enhance code organization, we move these declarations to mlx5_core.h, where they are more appropriately scoped. This refactor removes unnecessary exported symbols and prevents unexported functions from being inadvertently referenced outside of the mlx5_core driver. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250304160620.417580-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-06 | block: Name the RQF flags enum | Breno Leitao
Commit 5f89154e8e9e3445f9b59 ("block: Use enum to define RQF_x bit indexes") converted the RQF flags to an anonymous enum, which was a beneficial change. This patch goes one step further by naming the enum as "rqf_flags". This naming enables exporting these flags to BPF clients, eliminating the need to duplicate these flags in BPF code. Instead, BPF clients can now access the same kernel-side values through CO-RE (Compile Once, Run Everywhere), as shown in this example: rqf_stats = bpf_core_enum_value(enum rqf_flags, __RQF_STATS) Suggested-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/r/20250306-rqf_flags-v1-1-bbd64918b406@debian.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-07 | Merge tag 'drm-misc-next-2025-03-06' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next | Dave Airlie
drm-misc-next for v6.15:

Cross-subsystem Changes:

base:
- component: Provide helper to query bound status

fbdev:
- fbtft: Remove access to page->index

Core Changes:

- Fix usage of logging macros in several places

gem:
- Add test function for imported dma-bufs and use it in core and helpers
- Avoid struct drm_gem_object.import_attach

tests:
- Fix lockdep warnings

ttm:
- Add helpers for TTM shrinker

Driver Changes:

adp:
- Add support for Apple Touch Bar displays on M1/M2

amdxdna:
- Fix interrupt handling

appletbdrm:
- Add support for Apple Touch Bar displays on x86

bridge:
- synopsys: Add HDMI audio support
- ti-sn65dsi83: Support negative DE polarity

ipu-v3:
- Remove unused code

nouveau:
- Avoid multiple -Wflex-array-member-not-at-end warnings

panthor:
- Fix CS_STATUS_ defines
- Improve locking

rockchip:
- analogix_dp: Add eDP support
- lvds: Improve logging
- vop2: Improve HDMI mode handling; Add support for RK3576
- Fix shutdown
- Support rk3562-mali

xe:
- Use TTM shrinker

Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20250306130700.GA485504@linux.fritz.box
2025-03-06 | inet: call inet6_ehashfn() once from inet6_hash_connect() | Eric Dumazet
inet6_ehashfn() being called from __inet6_check_established() has a big impact on performance, as shown in the Tested section.

After prior patch, we can compute the hash for port 0 from inet6_hash_connect(), and derive each hash in __inet_hash_connect() from this initial hash:

    hash(saddr, lport, daddr, dport) == hash(saddr, 0, daddr, dport) + lport

Apply the same principle for __inet_check_established(), although inet_ehashfn() has a smaller cost.

Tested:

Server: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog
Client: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog -c -H server

Before this patch:

    utime_start=0.286131
    utime_end=4.378886
    stime_start=11.952556
    stime_end=1991.655533
    num_transactions=1446830
    latency_min=0.001061085
    latency_max=12.075275028
    latency_mean=0.376375302
    latency_stddev=1.361969596
    num_samples=306383
    throughput=151866.56

perf top:

    50.01%  [kernel]  [k] __inet6_check_established
    20.65%  [kernel]  [k] __inet_hash_connect
    15.81%  [kernel]  [k] inet6_ehashfn
     2.92%  [kernel]  [k] rcu_all_qs
     2.34%  [kernel]  [k] __cond_resched
     0.50%  [kernel]  [k] _raw_spin_lock
     0.34%  [kernel]  [k] sched_balance_trigger
     0.24%  [kernel]  [k] queued_spin_lock_slowpath

After this patch:

    utime_start=0.315047
    utime_end=9.257617
    stime_start=7.041489
    stime_end=1923.688387
    num_transactions=3057968
    latency_min=0.003041375
    latency_max=7.056589232
    latency_mean=0.141075048 # Better latency metrics
    latency_stddev=0.526900516
    num_samples=312996
    throughput=320677.21 # 111 % increase, and 229 % for the series

perf top: inet6_ehashfn is no longer seen.

    39.67%  [kernel]  [k] __inet_hash_connect
    37.06%  [kernel]  [k] __inet6_check_established
     4.79%  [kernel]  [k] rcu_all_qs
     3.82%  [kernel]  [k] __cond_resched
     1.76%  [kernel]  [k] sched_balance_domains
     0.82%  [kernel]  [k] _raw_spin_lock
     0.81%  [kernel]  [k] sched_balance_rq
     0.81%  [kernel]  [k] sched_balance_trigger
     0.76%  [kernel]  [k] queued_spin_lock_slowpath

Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Tested-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://patch.msgid.link/20250305034550.879255-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
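The identity hash(saddr, lport, daddr, dport) == hash(saddr, 0, daddr, dport) + lport quoted in this commit can be demonstrated with a toy hash. The mixer below is an arbitrary stand-in, not the kernel's inet6_ehashfn(); the 32-bit address placeholders and every `sketch_` name are assumptions for illustration. What it shows is the shape of the optimization: the expensive mixing runs once for port 0, and each candidate local port costs a single addition.

```c
#include <stdint.h>

/* Arbitrary toy mixer standing in for the expensive part of the hash. */
static uint32_t sketch_mix(uint32_t a, uint32_t b, uint32_t c)
{
	uint32_t h = a * 0x9e3779b1u;

	h ^= b + 0x7f4a7c15u + (h << 6) + (h >> 2);
	h ^= c + 0x85ebca6bu + (h << 6) + (h >> 2);
	return h;
}

/*
 * Full hash, built so that the local port is added after the mixing step.
 * Addresses are 32-bit placeholders; real IPv6 addresses are 128 bits.
 */
static uint32_t sketch_ehashfn(uint32_t saddr, uint16_t lport,
			       uint32_t daddr, uint16_t dport)
{
	return sketch_mix(saddr, daddr, dport) + lport;
}

/* Cheap derivation: reuse the hash computed once for local port 0. */
static uint32_t sketch_hash_from_port0(uint32_t port0_hash, uint16_t lport)
{
	return port0_hash + lport;
}
```

A connect-time port search would call `sketch_ehashfn()` once with `lport` set to 0 and then use `sketch_hash_from_port0()` for every port it tries.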
2025-03-06 | Merge branch 'sched/urgent' into sched/core, to pick up dependent commits | Ingo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-06 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.14-rc6). Conflicts: net/ethtool/cabletest.c 2bcf4772e45a ("net: ethtool: try to protect all callback with netdev instance lock") 637399bf7e77 ("net: ethtool: netlink: Allow NULL nlattrs when getting a phy_device") No Adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-06 | eth: bnxt: remove most dependencies on RTNL | Stanislav Fomichev
Only devlink and sriov paths are grabbing rtnl explicitly. The rest is covered by netdev instance lock which the core now grabs, so there is no need to manage rtnl in most places anymore. On the core side we can now try to drop rtnl in some places (do_setlink for example) for the drivers that signal non-rtnl mode (TBD). Boot-tested and with `ethtool -L eth1 combined 24` to trigger reset. Cc: Saeed Mahameed <saeed@kernel.org> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250305163732.2766420-15-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-06 | docs: net: document new locking reality | Stanislav Fomichev
Also clarify ndo_get_stats (that read and write paths can run concurrently) and mention only RCU. Cc: Saeed Mahameed <saeed@kernel.org> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250305163732.2766420-14-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-06 | net: add option to request netdev instance lock | Stanislav Fomichev
Currently only the drivers that implement shaper or queue APIs are grabbing instance lock. Add an explicit opt-in for the drivers that want to grab the lock without implementing the above APIs.

There is a 3-byte hole after @up, use it:

        /* --- cacheline 47 boundary (3008 bytes) --- */
        u32                        napi_defer_hard_irqs; /*  3008     4 */
        bool                       up;                   /*  3012     1 */

        /* XXX 3 bytes hole, try to pack */

        struct mutex               lock;                 /*  3016   144 */

        /* XXX last struct has 1 hole */

Cc: Saeed Mahameed <saeed@kernel.org> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250305163732.2766420-13-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
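Why a new flag can be added without growing the structure can be checked with a small userspace sketch of the same layout. The structs below mirror only the three members shown in the layout dump; `sketch_lock` is a 144-byte stand-in for the mutex, and `sketch_opt_in` is a hypothetical name for the new boolean, since the commit message does not name the field. The sketch assumes a typical ABI where `bool` is one byte.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 144-byte stand-in for the mutex; only its size and alignment matter here. */
struct sketch_lock {
	uint64_t words[18];
};

/* Layout before the change: padding follows 'up'. */
struct sketch_before {
	uint32_t napi_defer_hard_irqs;
	bool up;
	struct sketch_lock lock;
};

/* Layout after the change: the new flag sits inside the former padding. */
struct sketch_after {
	uint32_t napi_defer_hard_irqs;
	bool up;
	bool sketch_opt_in; /* hypothetical name for the opt-in flag */
	struct sketch_lock lock;
};

/* Padding bytes between the last bool and the lock in each layout. */
static size_t sketch_hole_before(void)
{
	return offsetof(struct sketch_before, lock) -
	       (offsetof(struct sketch_before, up) + sizeof(bool));
}

static size_t sketch_hole_after(void)
{
	return offsetof(struct sketch_after, lock) -
	       (offsetof(struct sketch_after, sketch_opt_in) + sizeof(bool));
}
```

Comparing `sizeof` and `offsetof(..., lock)` for the two structs shows that the lock does not move and the total size is unchanged; the hole simply shrinks by the size of the new member.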