path: root/include
Age  Commit message  Author
2024-11-13  block: add a rq_list type  Christoph Hellwig
Replace the semi-open coded request list helpers with a proper rq_list type that mirrors the bio_list and has head and tail pointers. Besides better type safety, this actually allows inserting at the tail of the list, which will be useful soon. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241113152050.157179-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
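To illustrate why tracking both ends matters, here is a minimal userspace sketch of a singly linked list with head and tail pointers. The names `struct req`, `struct req_list`, and the `req_list_*` helpers are hypothetical stand-ins chosen for this example; they are not the kernel's rq_list API, whose exact helpers are not shown in the commit text.

```c
#include <stddef.h>

/* Hypothetical stand-in for a request; only the link field matters here. */
struct req {
	int id;
	struct req *next;
};

/* Head and tail pointers, the shape the commit message describes. */
struct req_list {
	struct req *head;
	struct req *tail;
};

static void req_list_init(struct req_list *l)
{
	l->head = NULL;
	l->tail = NULL;
}

static int req_list_empty(const struct req_list *l)
{
	return l->head == NULL;
}

/* O(1) append, possible only because the tail is tracked. */
static void req_list_add_tail(struct req_list *l, struct req *rq)
{
	rq->next = NULL;
	if (l->tail)
		l->tail->next = rq;
	else
		l->head = rq;
	l->tail = rq;
}

/* O(1) prepend; also sets the tail when the list was empty. */
static void req_list_add_head(struct req_list *l, struct req *rq)
{
	rq->next = l->head;
	l->head = rq;
	if (!l->tail)
		l->tail = rq;
}

/* Remove and return the first entry, or NULL if the list is empty. */
static struct req *req_list_pop(struct req_list *l)
{
	struct req *rq = l->head;

	if (rq) {
		l->head = rq->next;
		if (!l->head)
			l->tail = NULL;
		rq->next = NULL;
	}
	return rq;
}
```

With only a head pointer, appending would require walking the whole list; the extra tail pointer trades one word of storage for constant-time insertion at either end.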
2024-11-13  block: remove rq_list_move  Christoph Hellwig
Unused now. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241113152050.157179-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-13  dax: Remove an unused field in struct dax_operations  Christophe JAILLET
.dax_supported() was apparently removed by commit 7b0800d00dae ("dax: remove dax_capable") in 2021-11. Remove the now unused function pointer from the struct dax_operations. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/56b92b722ca0a6fd1387c871a6ec01bcb9bd525e.1725203804.git.christophe.jaillet@wanadoo.fr Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-11-13  block: export blk_validate_limits  Christoph Hellwig
While block drivers do the validation as part of committing them to the queue, users that use the limit outside of a block device context have to validate the limits and fill in the calculated values as well. So far btrfs is the only user of queue limits without a block device, and it has gotten away with that more or less by accident. But with commit 559218d43ec9 ("block: pre-calculate max_zone_append_sectors") this became fatal for setups that have small max zone append size, as it won't be limited now. Export blk_validate_limits so that it can be called directly from btrfs. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Link: https://lore.kernel.org/r/20241113084541.34315-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-13  jbd2: avoid dozens of -Wflex-array-member-not-at-end warnings  Gustavo A. R. Silva
-Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Use the `DEFINE_RAW_FLEX()` helper for an on-stack definition of a flexible structure (`struct shash_desc`) where the size of the flexible-array member (`__ctx`) is known at compile-time, and refactor the rest of the code, accordingly. So, with this, fix 77 of the following warnings: include/linux/jbd2.h:1800:35: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/ZyU94w0IALVhc9Jy@kspp Signed-off-by: Theodore Ts'o <tytso@mit.edu>
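The general idea of an on-stack definition for a structure with a flexible-array member can be sketched in userspace C as follows. This is an illustration under assumptions, not the kernel's `DEFINE_RAW_FLEX()` implementation: `struct desc`, `CTX_BYTES`, and `DEFINE_DESC_ON_STACK` are hypothetical names, and the 16-byte trailing size is an arbitrary value picked for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical structure ending in a flexible-array member. */
struct desc {
	int kind;
	unsigned char ctx[];
};

#define CTX_BYTES 16	/* trailing size known at compile time (assumed value) */

/*
 * On-stack storage: a union whose byte array is large enough for the header
 * plus CTX_BYTES of trailing data. The flexible structure is the union's
 * other member, so it never sits in the middle of an enclosing struct.
 */
#define DEFINE_DESC_ON_STACK(name)					\
	union {								\
		unsigned char bytes[sizeof(struct desc) + CTX_BYTES];	\
		struct desc obj;					\
	} name##_storage = { .bytes = { 0 } };				\
	struct desc *name = &name##_storage.obj

/* Fill the on-stack object and return kind plus the sum of the ctx bytes. */
static int fill_and_sum(void)
{
	DEFINE_DESC_ON_STACK(d);
	int sum = 0;

	d->kind = 7;
	memset(d->ctx, 1, CTX_BYTES);
	for (size_t i = 0; i < CTX_BYTES; i++)
		sum += d->ctx[i];
	return d->kind + sum;
}
```

The design point is that the storage size is fixed at compile time, so no heap allocation is needed and the flexible structure is not followed by other members of a containing struct.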
2024-11-13  Merge tag 'nvme-6.13-2024-11-13' of git://git.infradead.org/nvme into ↵  Jens Axboe
for-6.13/block Pull NVMe updates from Keith: "nvme updates for Linux 6.13 - Use uring_cmd helper (Pavel) - Host Memory Buffer allocation enhancements (Christoph) - Target persistent reservation support (Guixin) - Persistent reservation tracing (Guixen) - NVMe 2.1 specification support (Keith) - Rotational Meta Support (Matias, Wang, Keith) - Volatile cache detection enhancement (Guixen)" * tag 'nvme-6.13-2024-11-13' of git://git.infradead.org/nvme: (22 commits) nvmet: add tracing of reservation commands nvme: parse reservation commands's action and rtype to string nvmet: report ns's vwc not present nvme: check ns's volatile write cache not present nvme: add rotational support nvme: use command set independent id ns if available nvmet: support for csi identify ns nvmet: implement rotational media information log nvmet: implement endurance groups nvmet: declare 2.1 version compliance nvmet: implement crto property nvmet: implement supported features log nvmet: implement supported log pages nvmet: implement active command set ns list nvmet: implement id ns for nvm command set nvmet: support reservation feature nvme: add reservation command's defines nvme-core: remove repeated wq flags nvmet: make nvmet_wq visible in sysfs nvme-pci: use dma_alloc_noncontigous if possible ...
2024-11-13  Merge tag 'qcom-drivers-for-6.13-2' of ↵  Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/drivers A few more Qualcomm driver updates for v6.13 Make the Adreno driver invoke the SMMU aperture setup firmware function, which is required to allow the GPU to manage per-process page tables in some firmware versions - as an example Rb3Gen2 has no GPU without this. Add X1E Devkit to the list of devices that has functional EFI variable access through the uefisecapp. Flip the "manual slice configuration quirk" in the Qualcomm LLCC driver, as this only applies to a single platform, and introduce support for QCS8300, QCS615, SAR2130P, and SAR1130P. Lastly, add IPQ5424 and IPQ5404 to the Qualcomm socinfo driver. * tag 'qcom-drivers-for-6.13-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: soc: qcom: ice: Remove the device_link field in qcom_ice drm/msm/adreno: Setup SMMU aparture for per-process page table firmware: qcom: scm: Introduce CP_SMMU_APERTURE_ID soc: qcom: socinfo: add IPQ5424/IPQ5404 SoC ID dt-bindings: arm: qcom,ids: add SoC ID for IPQ5424/IPQ5404 soc: qcom: llcc: Flip the manual slice configuration condition dt-bindings: firmware: qcom,scm: Document sm8750 SCM firmware: qcom: uefisecapp: Allow X1E Devkit devices soc: qcom: llcc: Add LLCC configuration for the QCS8300 platform dt-bindings: cache: qcom,llcc: Document the QCS8300 LLCC soc: qcom: llcc: Add configuration data for QCS615 dt-bindings: cache: qcom,llcc: Document the QCS615 LLCC soc: qcom: llcc: add support for SAR2130P and SAR1130P soc: qcom: llcc: use deciman integers for bit shift values dt-bindings: cache: qcom,llcc: document SAR2130P and SAR1130P Link: https://lore.kernel.org/r/20241113032425.356306-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-11-13  Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf  Linus Torvalds
Pull bpf fixes from Daniel Borkmann: - Fix a mismatching RCU unlock flavor in bpf_out_neigh_v6 (Jiawei Ye) - Fix BPF sockmap with kTLS to reject vsock and unix sockets upon kTLS context retrieval (Zijian Zhang) - Fix BPF bits iterator selftest for s390x (Hou Tao) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Fix mismatched RCU unlock flavour in bpf_out_neigh_v6 bpf: Add sk_is_inet and IS_ICSK check in tls_sw_has_ctx_tx/rx selftests/bpf: Use -4095 as the bad address for bits iterator
2024-11-13  Merge tag 'mm-hotfixes-stable-2024-11-12-16-39' of ↵  Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "10 hotfixes, 7 of which are cc:stable. 7 are MM, 3 are not. All singletons" * tag 'mm-hotfixes-stable-2024-11-12-16-39' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: swapfile: fix cluster reclaim work crash on rotational devices selftests: hugetlb_dio: fixup check for initial conditions to skip in the start mm/thp: fix deferred split queue not partially_mapped: fix mm/gup: avoid an unnecessary allocation call for FOLL_LONGTERM cases nommu: pass NULL argument to vma_iter_prealloc() ocfs2: fix UBSAN warning in ocfs2_verify_volume() nilfs2: fix null-ptr-deref in block_dirty_buffer tracepoint nilfs2: fix null-ptr-deref in block_touch_buffer tracepoint mm: page_alloc: move mlocked flag clearance into free_pages_prepare() mm: count zeromap read and set for swapout and swapin
2024-11-13  fs: Simplify getattr interface function checking AT_GETATTR_NOSEC flag  Stefan Berger
Commit 8a924db2d7b5 ("fs: Pass AT_GETATTR_NOSEC flag to getattr interface function") introduced the AT_GETATTR_NOSEC flag to ensure that the call paths only call vfs_getattr_nosec if it is set instead of vfs_getattr. Now, simplify the getattr interface functions of filesystems where the flag AT_GETATTR_NOSEC is checked. There is only a single caller of inode_operations getattr function and it is located in fs/stat.c in vfs_getattr_nosec. The caller there is the only one from which the AT_GETATTR_NOSEC flag is passed. Two filesystems are checking this flag in .getattr and the flag is always passed to them unconditionally from only vfs_getattr_nosec: - ecryptfs: Simplify by always calling vfs_getattr_nosec in ecryptfs_getattr. From there the flag is passed to no other function and this function is not called otherwise. - overlayfs: Simplify by always calling vfs_getattr_nosec in ovl_getattr. From there the flag is passed to no other function and this function is not called otherwise. The query_flags in vfs_getattr_nosec will mask-out AT_GETATTR_NOSEC from any caller using AT_STATX_SYNC_TYPE as mask so that the flag is not important inside this function. Also, since no filesystem is checking the flag anymore, remove the flag entirely now, including the BUG_ON check that never triggered. The net change of the changes here combined with the original commit is that ecryptfs and overlayfs do not call vfs_getattr but only vfs_getattr_nosec.
Fixes: 8a924db2d7b5 ("fs: Pass AT_GETATTR_NOSEC flag to getattr interface function") Reported-by: Al Viro <viro@zeniv.linux.org.uk> Closes: https://lore.kernel.org/linux-fsdevel/20241101011724.GN1350452@ZenIV/T/#u Cc: Tyler Hicks <code@tyhicks.com> Cc: ecryptfs@vger.kernel.org Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Amir Goldstein <amir73il@gmail.com> Cc: linux-unionfs@vger.kernel.org Cc: Christian Brauner <brauner@kernel.org> Cc: linux-fsdevel@vger.kernel.org Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-11-13  statmount: add flag to retrieve unescaped options  Miklos Szeredi
Filesystem options can be retrieved with STATMOUNT_MNT_OPTS, which returns a string of comma-separated options, where some characters are escaped using the \OOO notation. Add a new flag, STATMOUNT_OPT_ARRAY, which instead returns the raw option values separated with '\0' characters. Since escaped characters are rare, this interface is preferable for non-libmount users, which likely don't want to deal with option de-escaping.
Example code:
    if (st->mask & STATMOUNT_OPT_ARRAY) {
            const char *opt = st->str + st->opt_array;
            for (unsigned int i = 0; i < st->opt_num; i++) {
                    printf("opt_array[%i]: <%s>\n", i, opt);
                    opt += strlen(opt) + 1;
            }
    }
Example output:
    (1) mnt_opts: <lowerdir+=/l\054w\054r,lowerdir+=/l\054w\054r1,upperdir=/upp\054r,workdir=/w\054rk,redirect_dir=nofollow,uuid=null>
    (2) opt_array[0]: <lowerdir+=/l,w,r>
        opt_array[1]: <lowerdir+=/l,w,r1>
        opt_array[2]: <upperdir=/upp,r>
        opt_array[3]: <workdir=/w,rk>
        opt_array[4]: <redirect_dir=nofollow>
        opt_array[5]: <uuid=null>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Link: https://lore.kernel.org/r/20241112101006.30715-1-mszeredi@redhat.com Acked-by: Jeff Layton <jlayton@kernel.org> [brauner: tweak variable naming and parsing add example output] Signed-off-by: Christian Brauner <brauner@kernel.org>
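The walk over '\0'-separated options can be tried without the statmount() syscall. The sketch below operates on a plain buffer and a count; `walk_opt_array` and its callback parameter are hypothetical names introduced for this example and stand in for the `st->str + st->opt_array` pointer and `st->opt_num` count from the commit's own snippet.

```c
#include <stddef.h>
#include <string.h>

/*
 * Visit `count` NUL-terminated strings packed back to back starting at
 * `buf`. Returns the total number of option bytes seen, excluding the
 * terminators. `visit` may be NULL when only the total is wanted.
 */
static size_t walk_opt_array(const char *buf, unsigned int count,
			     void (*visit)(unsigned int idx, const char *opt))
{
	size_t total = 0;
	const char *opt = buf;

	for (unsigned int i = 0; i < count; i++) {
		size_t len = strlen(opt);

		if (visit)
			visit(i, opt);
		total += len;
		opt += len + 1;	/* skip the string and its '\0' separator */
	}
	return total;
}
```

Because the separator is '\0' rather than a comma, values containing commas need no de-escaping, which is the advantage the commit message points out.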
2024-11-13  Merge patch series "two little writeback cleanups v2"  Christian Brauner
Christoph Hellwig <hch@lst.de> says: This fixes one (of multiple) sparse warnings in fs-writeback.c, and then reshuffles the code a bit so that only the proper high level API instead of low-level helpers is exported. * patches from https://lore.kernel.org/r/20241112054403.1470586-1-hch@lst.de: writeback: wbc_attach_fdatawrite_inode out of line writeback: add a __releases annoation to wbc_attach_and_unlock_inode Link: https://lore.kernel.org/r/20241112054403.1470586-1-hch@lst.de Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-13  fs: add the ability for statmount() to report the sb_source  Jeff Layton
/proc/self/mountinfo displays the source for the mount, but statmount() doesn't yet have a way to return it. Add a new STATMOUNT_SB_SOURCE flag, claim the 32-bit __spare1 field to hold the offset into the str[] array. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20241111-statmount-v4-3-2eaf35d07a80@kernel.org Acked-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-13  ALSA: compress_offload: Add missing descriptions in structs  Takashi Iwai
Add the missing descriptions for snd_compr_ops, snd_compr_task and snd_compr_task_status fields, in order to shut up the build warnings. Fixes: 04177158cf98 ("ALSA: compress_offload: introduce accel operation mode") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/20241028193731.4b0c3788@canb.auug.org.au Link: https://patch.msgid.link/20241113072304.4447-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-11-13  clocksource/drivers/dw_apb: Remove unused dw_apb_clockevent functions  Dr. David Alan Gilbert
dw_apb_clockevent_pause(), dw_apb_clockevent_resume() and dw_apb_clockevent_stop() have been unused since 2021's commit 1b79fc4f2bfd ("x86/apb_timer: Remove driver for deprecated platform") Remove them. (Some of the other clockevent functions are still called by dw_apb_timer_of.c so I guess it is still in use?) Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20241025203101.241709-1-linux@treblig.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-11-13  ALSA: pcm: Define snd_pcm_mmap_data_{open|close}() locally  Takashi Iwai
snd_pcm_mmap_data_open() and _close() are defined as inline functions in the public sound/pcm.h, but those are used only locally in pcm_native.c, hence they should be better placed there. Also, since those are referred to as callbacks, the useless inline is dropped. Link: https://patch.msgid.link/20241113111628.17069-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-11-13  Merge branch 'kvm-docs-6.13' into HEAD  Paolo Bonzini
- Drop obsolete references to PPC970 KVM, which was removed 10 years ago. - Fix incorrect references to non-existing ioctls - List registers supported by KVM_GET/SET_ONE_REG on s390 - Use rST internal links - Reorganize the introduction to the API document
2024-11-13  Merge tag 'kvm-x86-generic-6.13' of https://github.com/kvm-x86/linux into HEAD  Paolo Bonzini
KVM generic changes for 6.13 - Rework kvm_vcpu_on_spin() to use a single for-loop instead of making two partial passes over "all" vCPUs. Opportunistically expand the comment to better explain the motivation and logic. - Protect vcpu->pid accesses outside of vcpu->mutex with a rwlock instead of RCU, so that running a vCPU on a different task doesn't encounter long stalls due to having to wait for all CPUs to become quiescent.
2024-11-13  printk: add dummy printk_force_console_enter/exit helpers  Arnd Bergmann
The newly added interface is broken when PRINTK is disabled:
    drivers/tty/sysrq.c: In function '__handle_sysrq':
    drivers/tty/sysrq.c:601:9: error: implicit declaration of function 'printk_force_console_enter' [-Wimplicit-function-declaration]
      601 |         printk_force_console_enter();
          |         ^~~~~~~~~~~~~~~~~~~~~~~~~~
    drivers/tty/sysrq.c:611:25: error: implicit declaration of function 'printk_force_console_exit' [-Wimplicit-function-declaration]
      611 |                         printk_force_console_exit();
          |                         ^~~~~~~~~~~~~~~~~~~~~~~~~
Add empty stub functions for both. Fixes: ed76c07c6885 ("printk: Introduce FORCE_CON flag") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Marcos Paulo de Souza <mpdesouza@suse.com> Tested-by: Marcos Paulo de Souza <mpdesouza@suse.com> Link: https://lore.kernel.org/r/20241112142939.724093-1-arnd@kernel.org Signed-off-by: Petr Mladek <pmladek@suse.com>
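The stub pattern this fix relies on can be shown in isolation. In the sketch below, `HAVE_LOGGER` is a hypothetical switch standing in for the kernel's PRINTK configuration option, and `logger_force_enter()`, `logger_force_exit()`, and `handle_key()` are made-up names; the point is only that callers compile unchanged whether or not the feature is built in.

```c
/* HAVE_LOGGER is a hypothetical switch standing in for the PRINTK option. */
#ifdef HAVE_LOGGER
/* Feature built in: the real implementations live elsewhere. */
void logger_force_enter(void);
void logger_force_exit(void);
#else
/* Feature compiled out: empty inline stubs keep every caller building. */
static inline void logger_force_enter(void) { }
static inline void logger_force_exit(void) { }
#endif

/* A caller that needs no #ifdef of its own. */
static int handle_key(int key)
{
	int handled;

	logger_force_enter();
	handled = (key == 'h');
	logger_force_exit();
	return handled;
}
```

Keeping the conditional in the header rather than at each call site means a missing stub shows up as exactly the kind of implicit-declaration error quoted above, and is fixed in one place.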
2024-11-13  LoongArch: KVM: Add PCHPIC device support  Xianglai Li
Add device model for PCHPIC interrupt controller, implement basic create & destroy interfaces, and register device model to kvm device table. Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn> Signed-off-by: Xianglai Li <lixianglai@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-11-13  LoongArch: KVM: Add EIOINTC device support  Xianglai Li
Add device model for EIOINTC interrupt controller, implement basic create & destroy interfaces, and register device model to kvm device table. Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn> Signed-off-by: Xianglai Li <lixianglai@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-11-13  LoongArch: KVM: Add IPI device support  Xianglai Li
Add device model for IPI interrupt controller, implement basic create & destroy interfaces, and register device model to kvm device table. Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn> Signed-off-by: Xianglai Li <lixianglai@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-11-13  LoongArch: KVM: Add iocsr and mmio bus simulation in kernel  Xianglai Li
Add iocsr and mmio memory read and write simulation to the kernel. When the VM accesses the device address space through iocsr instructions or mmio, it does not need to return to the qemu user mode but can directly complete the access in the kernel mode. Signed-off-by: Tianrui Zhao <zhaotianrui@loongson.cn> Signed-off-by: Xianglai Li <lixianglai@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-11-13  ALSA: tidyup SNDRV_PCM_TRIGGER_xxx numbering  Kuninori Morimoto
pcm.h has SNDRV_PCM_TRIGGER_xxx, but it is missing "2". Fix it up. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://patch.msgid.link/87ed3gsziy.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-11-13  usb: cdns3: Synchronise PCI IDs via common data base  Andy Shevchenko
There are a few places in the kernel where PCI IDs for different Cadence USB controllers are being used. Besides different naming, they duplicate each other. Make this all in order by providing common definitions via PCI IDs database and use in all users. While doing that, rename definitions as Roger suggested. Suggested-by: Roger Quadros <rogerq@kernel.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20241112160125.2340972-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-11-12  bpf: Add kernel symbol for struct_ops trampoline  Xu Kuohai
Without kernel symbols for struct_ops trampoline, the unwinder may produce unexpected stacktraces. For example, the x86 ORC and FP unwinders check if an IP is in kernel text by verifying the presence of the IP's kernel symbol. When a struct_ops trampoline address is encountered, the unwinder stops due to the absence of symbol, resulting in an incomplete stacktrace that consists only of direct and indirect child functions called from the trampoline. The arm64 unwinder is another example. While the arm64 unwinder can proceed across a struct_ops trampoline address, the corresponding symbol name is displayed as "unknown", which is confusing. Thus, add kernel symbol for struct_ops trampoline. The name is bpf__<struct_ops_name>_<member_name>, where <struct_ops_name> is the type name of the struct_ops, and <member_name> is the name of the member that the trampoline is linked to. Below is a comparison of stacktraces captured on x86 by perf record, before and after this patch.
Before:
    ffffffff8116545d __lock_acquire+0xad ([kernel.kallsyms])
    ffffffff81167fcc lock_acquire+0xcc ([kernel.kallsyms])
    ffffffff813088f4 __bpf_prog_enter+0x34 ([kernel.kallsyms])
After:
    ffffffff811656bd __lock_acquire+0x30d ([kernel.kallsyms])
    ffffffff81167fcc lock_acquire+0xcc ([kernel.kallsyms])
    ffffffff81309024 __bpf_prog_enter+0x34 ([kernel.kallsyms])
    ffffffffc000d7e9 bpf__tcp_congestion_ops_cong_avoid+0x3e ([kernel.kallsyms])
    ffffffff81f250a5 tcp_ack+0x10d5 ([kernel.kallsyms])
    ffffffff81f27c66 tcp_rcv_established+0x3b6 ([kernel.kallsyms])
    ffffffff81f3ad03 tcp_v4_do_rcv+0x193 ([kernel.kallsyms])
    ffffffff81d65a18 __release_sock+0xd8 ([kernel.kallsyms])
    ffffffff81d65af4 release_sock+0x34 ([kernel.kallsyms])
    ffffffff81f15c4b tcp_sendmsg+0x3b ([kernel.kallsyms])
    ffffffff81f663d7 inet_sendmsg+0x47 ([kernel.kallsyms])
    ffffffff81d5ab40 sock_write_iter+0x160 ([kernel.kallsyms])
    ffffffff8149c67b vfs_write+0x3fb ([kernel.kallsyms])
    ffffffff8149caf6 ksys_write+0xc6 ([kernel.kallsyms])
    ffffffff8149cb5d __x64_sys_write+0x1d ([kernel.kallsyms])
    ffffffff81009200 x64_sys_call+0x1d30 ([kernel.kallsyms])
    ffffffff82232d28 do_syscall_64+0x68 ([kernel.kallsyms])
    ffffffff8240012f entry_SYSCALL_64_after_hwframe+0x76 ([kernel.kallsyms])
Fixes: 85d33df357b6 ("bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS") Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241112145849.3436772-4-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
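The naming scheme described in this commit, bpf__<struct_ops_name>_<member_name>, can be reproduced with a few lines of C. The helper `trampoline_sym_name()` is a hypothetical name for this sketch and is not the function the kernel uses; only the format of the resulting string comes from the commit message.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build "bpf__<struct_ops_name>_<member_name>" into buf.
 * Returns 0 on success, -1 if the name did not fit or formatting failed.
 */
static int trampoline_sym_name(char *buf, size_t size,
			       const char *ops_name, const char *member)
{
	int n = snprintf(buf, size, "bpf__%s_%s", ops_name, member);

	if (n < 0 || (size_t)n >= size)
		return -1;
	return 0;
}
```

For the struct_ops type tcp_congestion_ops and member cong_avoid, this yields the symbol that appears in the "After" trace.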
2024-11-12  bpf: Support private stack for struct_ops progs  Yonghong Song
For struct_ops progs, whether a particular prog uses private stack depends on prog->aux->priv_stack_requested setting before actual insn-level verification for that prog. One particular implementation is to piggyback on struct_ops->check_member(). The next patch has an example for this. The struct_ops->check_member() sets prog->aux->priv_stack_requested to be true which enables private stack usage. The struct_ops prog follows the same rule as kprobe/tracing progs after function bpf_enable_priv_stack(). For example, even if a struct_ops prog requests private stack, it could still use normal kernel stack if the stack size is small (< 64 bytes). Similar to tracing progs, nested same cpu same prog run will be skipped. A field (recursion_detected()) is added to bpf_prog_aux structure. If bpf_prog->aux->recursion_detected is implemented by the struct_ops subsystem and nested same cpu/prog happens, the function will be triggered to report an error, collect related info, etc. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241112163933.2224962-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12  bpf, x86: Support private stack in jit  Yonghong Song
Private stack is allocated in function bpf_int_jit_compile() with alignment 8. Private stack allocation size includes the stack size determined by verifier and additional space to protect stack overflow and underflow. See below an illustration:
    ---> memory address increasing
    [8 bytes to protect overflow] [normal stack] [8 bytes to protect underflow]
If overflow/underflow is detected, kernel messages will be emitted in dmesg like
    BPF private stack overflow/underflow detected for prog Fx
    BPF Private stack overflow/underflow detected for prog bpf_prog_a41699c234a1567a_subprog1x
Those messages are generated when I made some changes to jitted code to intentionally cause overflow for some progs. For the jited prog, the x86 register 9 (X86_REG_R9) is used to replace bpf frame register (BPF_REG_10). The private stack is used per subprog per cpu. The X86_REG_R9 is saved and restored around every func call (not including tailcall) to maintain correctness of X86_REG_R9. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241112163922.2224385-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
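The guarded layout in the illustration can be modelled in userspace as a buffer with a sentinel word on each side of the usable area. This is a sketch under assumptions: the commit text does not state which guard value the kernel writes or how it checks it, so `GUARD_MAGIC` and all `guarded_stack_*` names here are hypothetical, and `calloc()` stands in for the per-CPU allocator.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_SZ    8
#define GUARD_MAGIC UINT64_C(0xEB9F12345678EB9F)	/* arbitrary sentinel (assumption) */

struct guarded_stack {
	unsigned char *base;	/* start of the whole allocation */
	size_t stack_sz;	/* usable bytes between the two guards */
};

/* Allocate [guard][stack][guard]; the stack size is rounded up to 8 bytes. */
static int guarded_stack_alloc(struct guarded_stack *gs, size_t stack_sz)
{
	uint64_t magic = GUARD_MAGIC;

	stack_sz = (stack_sz + 7) & ~(size_t)7;
	gs->base = calloc(1, stack_sz + 2 * GUARD_SZ);
	if (!gs->base)
		return -1;
	gs->stack_sz = stack_sz;
	memcpy(gs->base, &magic, GUARD_SZ);				/* low guard */
	memcpy(gs->base + GUARD_SZ + stack_sz, &magic, GUARD_SZ);	/* high guard */
	return 0;
}

/* Lowest usable address; a downward-growing stack overflows into the low guard. */
static unsigned char *guarded_stack_area(struct guarded_stack *gs)
{
	return gs->base + GUARD_SZ;
}

/* Returns 1 if both guards still hold the sentinel, 0 otherwise. */
static int guarded_stack_intact(const struct guarded_stack *gs)
{
	uint64_t magic = GUARD_MAGIC;

	return !memcmp(gs->base, &magic, GUARD_SZ) &&
	       !memcmp(gs->base + GUARD_SZ + gs->stack_sz, &magic, GUARD_SZ);
}

static void guarded_stack_free(struct guarded_stack *gs)
{
	free(gs->base);
	gs->base = NULL;
}
```

A check of this kind only detects corruption after the fact, which matches the commit's description of messages being emitted once an overflow or underflow has been detected.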
2024-11-12  bpf: Enable private stack for eligible subprogs  Yonghong Song
If private stack is used by any subprog, set that subprog prog->aux->jits_use_priv_stack to be true so later jit can allocate private stack for that subprog properly. Also set env->prog->aux->jits_use_priv_stack to be true if any subprog uses private stack. This is a use case for a single main prog (no subprogs) to use private stack, and also a use case for later struct-ops progs where env->prog->aux->jits_use_priv_stack will enable recursion check if any subprog uses private stack. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241112163912.2224007-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12  bpf: Find eligible subprogs for private stack support  Yonghong Song
Private stack will be allocated with percpu allocator in jit time. To avoid complexity at runtime, only one copy of private stack is available per cpu per prog. So runtime recursion check is necessary to avoid stack corruption. Current private stack only supports kprobe/perf_event/tp/raw_tp which has recursion check in the kernel, and prog types that use bpf trampoline recursion check. For trampoline related prog types, currently only tracing progs have recursion checking. To avoid complexity, all async_cb subprogs use normal kernel stack including those subprogs used by both main prog subtree and async_cb subtree. Any prog having tail call also uses kernel stack. To avoid jit penalty with private stack support, a subprog stack size threshold is set such that only if the stack size is no less than the threshold, private stack is supported. The current threshold is 64 bytes. This avoids jit penalty if the stack usage is small. A useless 'continue' is also removed from a loop in func check_max_stack_depth(). Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241112163907.2223839-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
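The eligibility rules listed in this commit message can be condensed into a small predicate. The sketch is a simplification under assumptions: `struct subprog_info`, its fields, and `wants_private_stack()` are hypothetical names, and the real verifier logic works on whole call trees rather than a single record. Only the 64-byte threshold and the three exclusions come from the text above.

```c
#include <stdbool.h>

#define PRIV_STACK_MIN_SIZE 64	/* threshold quoted in the commit message */

/* Hypothetical summary of one subprog, for illustration only. */
struct subprog_info {
	unsigned int stack_depth;	/* bytes of stack the subprog needs */
	bool is_async_cb;		/* reached from an async callback */
	bool has_tail_call;		/* the prog uses tail calls */
	bool has_recursion_check;	/* prog type has a runtime recursion check */
};

/* True if the subprog should get a private stack under the stated rules. */
static bool wants_private_stack(const struct subprog_info *sp)
{
	if (!sp->has_recursion_check)
		return false;	/* one copy per cpu per prog needs recursion protection */
	if (sp->is_async_cb || sp->has_tail_call)
		return false;	/* these always use the normal kernel stack */
	return sp->stack_depth >= PRIV_STACK_MIN_SIZE;
}
```

The comparison is `>=` because the message says private stack is supported when the stack size is "no less than" the threshold.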
2024-11-12  srcu: Remove smp_mb() from srcu_read_unlock_lite()  Paul E. McKenney
The srcu_read_unlock_lite() function invokes __srcu_read_unlock() instead of __srcu_read_unlock_lite(), which means that it is doing an unnecessary smp_mb(). This is harmless other than the performance degradation. This commit therefore switches to __srcu_read_unlock_lite(). Reported-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Closes: https://lore.kernel.org/all/d07e8f4a-d5ff-4c8e-8e61-50db285c57e9@amd.com/ Fixes: c0f08d6b5a61 ("srcu: Add srcu_read_lock_lite() and srcu_read_unlock_lite()") Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12  Merge tag 'samsung-soc-6.13' of ↵  Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into soc/arm Samsung mach/soc changes for v6.13 Few minor cleanups in platform data headers: drop unused declarations. * tag 'samsung-soc-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: ASoC: samsung: Remove obsoleted declaration for s3c64xx_ac97_setup_gpio ARM: samsung: Remove obsoleted declaration for s3c_hwmon_set_platdata Link: https://lore.kernel.org/r/20241029081002.21106-3-krzysztof.kozlowski@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-11-12  Merge tag 'qcom-arm64-for-6.13' of ↵  Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/dt Qualcomm Arm64 DeviceTree changes for v6.13 Introduce descriptions of the 8cx Gen3-based Microsoft Surface Pro 9 5G, X Elite based Dell XPS 13 9345, the QCS9100 platform and the "Ride" development boards thereon, and the SM7325 platform and the Nothing Phone 1. MSM8998 gains support for HDMI. The Lenovo Miix 630 gains support for volume keys, audio and sensor DSPs, touchscreen, and its specific WiFi calibration variant. On QCM6490, Fairphone FP5 gains a thermistor adjacent to UFS/RAM, while the IDP gains UFS and WiFi support. For QCS6490 changes to Rb3Gen2 enables WiFi, Venus, PCIe, SD-card, and volume keys. Adreno speedbins are adjusted and PMU nodes' compatibles for the two clusters are corrected. The DB845C/RB3 and QRB5165 RB5 vision mezzanines are converted to DeviceTree overlays, and both gains CMA heap for libcamera to use. SA8775P gains GPI DMA support, support for controlling download mode (bootloader-assisted ramdump support), additional UARTs, and qcrypto support. The "Ride" development board gains WiFi and Bluetooth support. On SC8280XP (8cx Gen3) another UART is described, used in the Microsoft Surface 9 5G. The WiFi/BT combo chip's power management unit is described on the CRD and Lenovo ThinkPad X13s. On SDM630/660 the GPU SMMU and clock controller is added, as is the A2Noc and LPASS SMMU, and the DSP-based WiFi device. GPU, modem DSP and WiFi is then enabled on the Inforce 6560 development board. On SM8450 Hardware Development Kit, the WCN6855 is modelled to enable WiFi and Bluetooth. A "global" interrupt is defined on SM8450 PCIe RC controller, to enable hotplug. On X Elite, USB Type-C controllers are marked as usb-role-switch capable, the GICv3 ITS is enabled for PCIe. TCSR region is described and wired up to allow setting and cleaning the download mode (bootloader-assisted ramdump) flag, and residency numbers for C4/C5 are updated. 
USB role switch is enabled on Lenovo ThinkPad T14s and the ASUS Vivobook S15. The T14s also gains support for a second source trackpad. The Microsoft Surface Laptop gains LID switch and the USB Type-A connector attached to the multiport controller is enabled. The CRD has its HID device power supplies described. Application SMMU is flagged as DMA coherent across QDU1000, SC7180, SC8180X, SC8280XP, SDM670, SDM845, SM8150, SM8350, SM8450, and X1E80100. In addition to this, the effort to improve style and binding compliance continued. * tag 'qcom-arm64-for-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: (120 commits) arm64: dts: qcom: sdm845-db845c-navigation-mezzanine: Add cma heap for libcamera softisp support arm64: dts: qcom: qrb5165-rb5-vision-mezzanine: Add cma heap for libcamera softisp support arm64: dts: qcom: qrb5165-rb5-vision-mezzanine: Drop redundant clock-lanes from camera@1a arm64: dts: qcom: sc8280xp-x13s: Drop redundant clock-lanes from camera@10 arm64: dts: qcom: sdm845-db845c-navigation-mezzanine: Convert mezzanine riser to dtso arm64: dts: qcom: qrb5165-rb5-vision-mezzanine: Convert mezzanine riser to dtbo arm64: dts: qcom: sm8450-hdk: model the PMU of the on-board wcn6855 arm64: dts: qcom: sc8280xp-x13s: model the PMU of the on-board wcn6855 arm64: dts: qcom: sc8280xp-crd: enable bluetooth arm64: dts: qcom: sc8280xp-crd: model the PMU of the on-board wcn6855 arm64: dts: qcom: qcs9100: Add support for the QCS9100 Ride and Ride Rev3 boards dt-bindings: arm: qcom: Document qcs9100-ride and qcs9100-ride Rev3 arm64: dts: qcom: x1e80100: Update C4/C5 residency/exit numbers arm64: dts: qcom: x1e80100-crd: describe HID supplies arm64: dts: qcom: msm8998-lenovo-miix-630: add WiFi calibration variant arm64: dts: qcom: msm8998-clamshell: enable resin/VolDown arm64: dts: qcom: msm8998-lenovo-miix-630: enable VolumeUp button arm64: dts: qcom: msm8998-lenovo-miix-630: enable aDSP and SLPI arm64: dts: qcom: msm8998-lenovo-miix-630: enable 
touchscreen arm64: dts: qcom: qcs6490-rb3gen2: Add PCIe nodes ... Link: https://lore.kernel.org/r/20241105164901.7787-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-11-12  Merge tag 'renesas-dts-for-v6.13-tag2' of ↵  Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into soc/dt Renesas DTS updates for v6.13 (take two) - Add a CPU Operating Performance Points table for the RZ/V2H SoC, - Add Battery Backup Function (VBATTB) and RTC support for the RZ/G3S SoC and the RZ/G3S SMARC SoM, - Add DMAC support for MMC on the RZ/A1H SoC and the Genmai development board, - Miscellaneous fixes and improvements. * tag 'renesas-dts-for-v6.13-tag2' of https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel: arm64: dts: renesas: rzg3s-smarc-som: Enable RTC arm64: dts: renesas: rzg3s-smarc-som: Enable VBATTB arm64: dts: renesas: r9a08g045: Add RTC node arm64: dts: renesas: r9a08g045: Add VBATTB node arm64: dts: renesas: white-hawk-cpu-common: Add pin control for DSI-eDP IRQ ARM: dts: renesas: r7s72100: Add DMA support to MMCIF ARM: dts: renesas: r7s72100: Add DMAC node arm64: dts: renesas: hihope: Drop #sound-dai-cells dt-bindings: clock: renesas,r9a08g045-vbattb: Document VBATTB dt-bindings: clock: r9a08g045-cpg: Add power domain ID for RTC arm64: dts: renesas: r9a09g057: Add OPP table Link: https://lore.kernel.org/r/cover.1730726155.git.geert+renesas@glider.be Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-11-12block: remove the ioprio field from struct requestChristoph Hellwig
The request ioprio is only initialized from the first attached bio, so requests without a bio already never set it. Directly use the bio field instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241112170050.1612998-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-12block: remove the write_hint field from struct requestChristoph Hellwig
The write_hint is only used for read/write requests, which must have a bio attached to them. Just use the bio field instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20241112170050.1612998-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-11-12Merge tag 'samsung-dt64-6.13' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into soc/dt Samsung DTS ARM64 changes for v6.13 1. Add new SoC Samsung Exynos8895 and new board using it: Samsung Galaxy S8 (SM-G950F) mobile phone. Only small support so far: CPUs (Samsung Mongoose M2), main clock controllers (FSYS, PERIC, TOP), pin controllers, SPI for cameras, timers. 2. Add new SoC Samsung Exynos990 and new board using it: Samsung Galaxy Note20 5G (c1s/SM-N981B) mobile phone. Only minimal support so far: CPUs (Samsung Mongoose M5), pin controllers, timers. 3. Prepare for adding new SoC Samsung Exynos9810 - add bindings. The SoC DTSI was not yet ready, but it is posted on the mailing lists so should come soon. 4. ExynosAutov920: Add several clock controllers. * tag 'samsung-dt64-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: dt-bindings: arm: samsung: Document Exynos9810 and starlte board binding dt-bindings: soc: samsung: exynos-pmu: Add exynos9810 compatible dt-bindings: arm: cpus: Add Samsung Mongoose M3 arm64: dts: exynos8895: Add spi_0/1 nodes arm64: dts: exynos8895: Add Multi Core Timer (MCT) node arm64: dts: exynos8895: Add clock management unit nodes dt-bindings: timer: exynos4210-mct: Add samsung,exynos8895-mct compatible dt-bindings: clock: samsung: Add Exynos8895 SoC arm64: dts: exynos: Add initial support for Samsung Galaxy Note20 5G (c1s) arm64: dts: exynos: Add initial support for the Exynos 990 SoC dt-bindings: arm: samsung: samsung-boards: Add bindings for Exynos 990 boards dt-bindings: arm: cpus: Add Samsung Mongoose M5 arm64: dts: exynosautov920: add peric1, misc and hsi0/1 clock DT nodes dt-bindings: clock: exynosautov920: add peric1, misc and hsi0/1 clock definitions arm64: dts: exynos: Add initial support for Samsung Galaxy S8 arm64: dts: exynos: Add initial support for exynos8895 SoC dt-bindings: soc: samsung: exynos-pmu: Add exynos8895 compatible dt-bindings: arm: samsung: Document dreamlte board binding dt-bindings: arm: cpus: Add Samsung 
Mongoose M2 Link: https://lore.kernel.org/r/20241029081002.21106-2-krzysztof.kozlowski@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-11-12dt-bindings: power: qcom,rpmpd: document the SM8750 RPMh Power DomainsTaniya Das
Document the RPMh Power Domains on the SM8750 Platform. Signed-off-by: Taniya Das <quic_tdas@quicinc.com> Signed-off-by: Jishnu Prakash <quic_jprakash@quicinc.com> Signed-off-by: Melody Olvera <quic_molvera@quicinc.com> Message-ID: <20241112002444.2802092-2-quic_molvera@quicinc.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-11-12iommu/arm-smmu-v3: Support IOMMU_HWPT_INVALIDATE using a VIOMMU objectNicolin Chen
Implement the vIOMMU's cache_invalidate op for user space to invalidate the IOTLB entries, Device ATS and CD entries that are cached by hardware. Add struct iommu_viommu_arm_smmuv3_invalidate defining invalidation entries that are simply in the native format of a 128-bit TLBI command. Scan those commands against the permitted command list and fix their VMID/SID fields to match what is stored in the vIOMMU. Link: https://patch.msgid.link/r/12-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com Co-developed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Eric Auger <eric.auger@redhat.com> Co-developed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
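The "scan and fix" step described above can be pictured with a small user-space sketch. It is not the driver's code: the `demo_*` names, the permitted opcode values, and the field positions (opcode in bits [7:0], VMID in bits [47:32] of the first 64-bit word) are assumptions made for illustration. Commands that carry a SID are left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define DEMO_OP_MASK    0xffULL
#define DEMO_VMID_SHIFT 32
#define DEMO_VMID_MASK  (0xffffULL << DEMO_VMID_SHIFT)

/* One 128-bit command, stored as two 64-bit words. */
struct demo_cmd {
	uint64_t data[2];
};

/* Hypothetical allow-list of opcodes user space may submit. */
static const uint8_t demo_permitted_ops[] = { 0x10, 0x11, 0x12 };

static int demo_op_permitted(uint8_t op)
{
	size_t i;

	for (i = 0; i < sizeof(demo_permitted_ops); i++)
		if (demo_permitted_ops[i] == op)
			return 1;
	return 0;
}

/*
 * Walk the user-supplied commands. Stop at the first opcode that is not on
 * the allow-list; otherwise overwrite the VMID field with the value owned by
 * the vIOMMU, ignoring whatever user space put there.
 * Returns 0 on success, -1 on a forbidden command. *nr_ok reports how many
 * commands were sanitized before stopping.
 */
static int demo_sanitize_cmds(struct demo_cmd *cmds, size_t n,
			      uint16_t vmid, size_t *nr_ok)
{
	size_t i;

	*nr_ok = 0;
	for (i = 0; i < n; i++) {
		uint8_t op = (uint8_t)(cmds[i].data[0] & DEMO_OP_MASK);

		if (!demo_op_permitted(op))
			return -1;
		cmds[i].data[0] &= ~DEMO_VMID_MASK;
		cmds[i].data[0] |= (uint64_t)vmid << DEMO_VMID_SHIFT;
		(*nr_ok)++;
	}
	return 0;
}
```

Overwriting the field rather than validating it means a guest cannot target another VM's translations by forging a VMID.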
2024-11-12iommu/arm-smmu-v3: Allow ATS for IOMMU_DOMAIN_NESTEDJason Gunthorpe
The EATS flag needs to flow through the vSTE and into the pSTE, and ensure physical ATS is enabled on the PCI device. The physical ATS state must match the VM's idea of EATS as we rely on the VM to issue the ATS invalidation commands. Thus ATS must remain off at the device until EATS on a nesting domain turns it on. Attaching a nesting domain is the point where the invalidation responsibility transfers to userspace. Update the ATS logic to track EATS for nesting domains and flush the ATC whenever the S2 nesting parent changes. Link: https://patch.msgid.link/r/11-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12iommu/arm-smmu-v3: Use S2FWB for NESTED domainsJason Gunthorpe
Force Write Back (FWB) changes how the S2 IOPTE's MemAttr field works. When S2FWB is supported and enabled the IOPTE will force cachable access to IOMMU_CACHE memory when nesting with a S1 and deny cachable access when !IOMMU_CACHE. When using a single stage of translation, a simple S2 domain, it doesn't change things for PCI devices as it is just a different encoding for the existing mapping of the IOMMU protection flags to cachability attributes. For non-PCI it also changes the combining rules when incoming transactions have inconsistent attributes. However, when used with a nested S1, FWB has the effect of preventing the guest from choosing a MemAttr in its S1 that would cause ordinary DMA to bypass the cache. Consistent with KVM we wish to deny the guest the ability to become incoherent with cached memory the hypervisor believes is cachable so we don't have to flush it. Allow NESTED domains to be created if the SMMU has S2FWB support and use S2FWB for NESTING_PARENTS. This is an additional option to CANWBS. Link: https://patch.msgid.link/r/10-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Donald Dutile <ddutile@redhat.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12iommu/arm-smmu-v3: Support IOMMU_DOMAIN_NESTEDJason Gunthorpe
For SMMUv3 an IOMMU_DOMAIN_NESTED is composed of a S2 iommu_domain acting as the parent and a user provided STE fragment that defines the CD table and related data with addresses translated by the S2 iommu_domain. The kernel only permits userspace to control certain allowed bits of the STE that are safe for user/guest control. IOTLB maintenance is a bit subtle here: the S1 implicitly includes the S2 translation, but there is no way of knowing which S1 entries refer to a range of S2. For the IOTLB we follow ARM's guidance and issue a CMDQ_OP_TLBI_NH_ALL to flush all ASIDs from the VMID after flushing the S2 on any change to the S2. The IOMMU_DOMAIN_NESTED can only be created from inside a VIOMMU as the invalidation path relies on the VIOMMU to translate the virtual stream ID used in the invalidation commands for the CD table and ATS. Link: https://patch.msgid.link/r/9-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Donald Dutile <ddutile@redhat.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12iommu/arm-smmu-v3: Support IOMMU_VIOMMU_ALLOCNicolin Chen
Add a new driver-type for ARM SMMUv3 to enum iommu_viommu_type. Implement an arm_vsmmu_alloc(). As an initial step, copy the VMID from s2_parent. A followup series is required to give the VIOMMU object its own VMID that will be used in all nesting configurations. Link: https://patch.msgid.link/r/8-v4-9e99b76f3518+3a8-smmuv3_nesting_jgg@nvidia.com Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12Merge branch 'iommufd/arm-smmuv3-nested' of iommu/linux into iommufd for-nextJason Gunthorpe
Common SMMUv3 patches for the following patches adding nesting, shared branch with the iommu tree. * 'iommufd/arm-smmuv3-nested' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/iommu/linux: iommu/arm-smmu-v3: Expose the arm_smmu_attach interface iommu/arm-smmu-v3: Implement IOMMU_HWPT_ALLOC_NEST_PARENT iommu/arm-smmu-v3: Support IOMMU_GET_HW_INFO via struct arm_smmu_hw_info iommu/arm-smmu-v3: Report IOMMU_CAP_ENFORCE_CACHE_COHERENCY for CANWBS ACPI/IORT: Support CANWBS memory access flag ACPICA: IORT: Update for revision E.f vfio: Remove VFIO_TYPE1_NESTING_IOMMU ... Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12ARM: 9415/1: amba: Add dev_is_amba() function and export it for modulesKunwu Chan
Add dev_is_amba() function to determine whether the device is an AMBA device. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
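The commit message does not show the implementation. Helpers of this kind are commonly written as a pointer comparison between the device's bus and the bus type object, and the sketch below assumes that pattern. The `sketch_*` types and names are user-space stand-ins, not kernel definitions.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's bus type and device structures. */
struct sketch_bus_type {
	const char *name;
};

struct sketch_device {
	const struct sketch_bus_type *bus;
};

static const struct sketch_bus_type sketch_amba_bustype = { "amba" };
static const struct sketch_bus_type sketch_platform_bustype = { "platform" };

/* A device is an AMBA device if it is registered on the AMBA bus. */
static bool sketch_dev_is_amba(const struct sketch_device *dev)
{
	return dev->bus == &sketch_amba_bustype;
}
```

Wrapping the comparison in an exported function lets modules ask the question without referencing the bus type object themselves.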
2024-11-12iommufd/viommu: Add iommufd_viommu_find_dev helperNicolin Chen
This avoids the bigger trouble of exposing struct iommufd_device and struct iommufd_vdevice in the public header. Link: https://patch.msgid.link/r/84fa7c624db4d4508067ccfdf42059533950180a.1730836308.git.nicolinc@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12iommu: Add iommu_copy_struct_from_full_user_array helperJason Gunthorpe
The iommu_copy_struct_from_user_array helper can be used to copy a single entry from a user array which might not be efficient if the array is big. Add a new iommu_copy_struct_from_full_user_array to copy the entire user array at once. Update the existing iommu_copy_struct_from_user_array kdoc accordingly. Link: https://patch.msgid.link/r/5cd773d9c26920c5807d232b21d415ea79172e49.1730836308.git.nicolinc@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
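The difference between the two helpers can be illustrated in user space. In this sketch `memcpy` stands in for the kernel's user-copy primitive, and the `demo_*` names and signatures are hypothetical rather than the real helpers' interfaces.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of an array handed over by user space. */
struct demo_user_array {
	const void *uptr;   /* start of the array */
	size_t entry_len;   /* size of one entry in bytes */
	size_t entry_num;   /* number of entries */
};

/* Copy a single entry by index: one copy call per entry. */
static int demo_copy_one(void *dst, size_t dst_len,
			 const struct demo_user_array *arr, size_t index)
{
	if (index >= arr->entry_num || arr->entry_len != dst_len)
		return -1;
	memcpy(dst, (const char *)arr->uptr + index * arr->entry_len, dst_len);
	return 0;
}

/* Copy the whole array with a single call, after checking it fits. */
static int demo_copy_full(void *dst, size_t dst_entry_len, size_t dst_capacity,
			  const struct demo_user_array *arr)
{
	if (arr->entry_len != dst_entry_len || arr->entry_num > dst_capacity)
		return -1;
	/* Guard the multiplication below against overflow. */
	if (arr->entry_num && arr->entry_len > SIZE_MAX / arr->entry_num)
		return -1;
	memcpy(dst, arr->uptr, arr->entry_len * arr->entry_num);
	return 0;
}
```

For a large array the per-entry form pays the copy setup cost once per element, while the full form pays it once in total.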
2024-11-12iommufd: Allow hwpt_id to carry viommu_id for IOMMU_HWPT_INVALIDATENicolin Chen
With a vIOMMU object, user space can flush any IOMMU related cache that can be directed via a vIOMMU object. It is similar to the IOMMU_HWPT_INVALIDATE uAPI, but can cover a wider range than IOTLB, e.g. device/descriptor cache. Allow hwpt_id of the iommu_hwpt_invalidate structure to carry a viommu_id, and reuse the IOMMU_HWPT_INVALIDATE uAPI for vIOMMU invalidations. Drivers can define different structures for vIOMMU invalidations vs. HWPT ones. Since both the HWPT-based and vIOMMU-based invalidation pathways check their own cache invalidation op, remove the WARN_ON_ONCE in the allocator. Update the uAPI, kdoc, and selftest case accordingly. Link: https://patch.msgid.link/r/b411e2245e303b8a964f39f49453a5dff280968f.1730836308.git.nicolinc@nvidia.com Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12iommu/viommu: Add cache_invalidate to iommufd_viommu_opsNicolin Chen
This per-vIOMMU cache_invalidate op is like the cache_invalidate_user op in struct iommu_domain_ops, but wider, supporting device cache (e.g. PCI ATC invalidations). Link: https://patch.msgid.link/r/90138505850fa6b165135e78a87b4cc7022869a4.1730836308.git.nicolinc@nvidia.com Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12iommufd/viommu: Add IOMMUFD_OBJ_VDEVICE and IOMMU_VDEVICE_ALLOC ioctlNicolin Chen
Introduce a new IOMMUFD_OBJ_VDEVICE to represent a physical device (struct device) against a vIOMMU (struct iommufd_viommu) object in a VM. This vDEVICE object (and its structure) holds all the information and attributes in the VM regarding the device related to the vIOMMU. As an initial patch, add a per-vIOMMU virtual ID. This can be: - Virtual StreamID on a nested ARM SMMUv3, an index to a Stream Table - Virtual DeviceID on a nested AMD IOMMU, an index to a Device Table - Virtual RID on a nested Intel VT-D IOMMU, an index to a Context Table Potentially, this vDEVICE structure would hold some vData for Confidential Compute Architecture (CCA). Use this virtual ID to index a "vdevs" xarray that belongs to a vIOMMU object. Add a new ioctl for vDEVICE allocations. Since a vDEVICE is a connection of a device object and an iommufd_viommu object, take two refcounts in the ioctl handler. Link: https://patch.msgid.link/r/cda8fd2263166e61b8191a3b3207e0d2b08545bf.1730836308.git.nicolinc@nvidia.com Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
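The relationship described above (a per-vIOMMU table keyed by virtual ID, with one reference taken on each side of the link) can be sketched in user space. A small fixed table stands in for the xarray and plain integers for the refcounts; every `demo_*` name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_VDEVS 8

/* Stand-in for the physical device object. */
struct demo_device {
	int refs;
};

/* Links one device to one vIOMMU under a virtual ID. */
struct demo_vdevice {
	uint64_t virt_id;
	struct demo_device *dev;
	int in_use;
};

/* Stand-in for the vIOMMU object and its "vdevs" table. */
struct demo_viommu {
	int refs;
	struct demo_vdevice vdevs[DEMO_MAX_VDEVS];
};

/* Look up the vDEVICE registered under a virtual ID, or NULL. */
static struct demo_vdevice *demo_vdevice_find(struct demo_viommu *v,
					      uint64_t virt_id)
{
	size_t i;

	for (i = 0; i < DEMO_MAX_VDEVS; i++)
		if (v->vdevs[i].in_use && v->vdevs[i].virt_id == virt_id)
			return &v->vdevs[i];
	return NULL;
}

/*
 * Register a device under a virtual ID. Fails if the ID is already taken
 * or the table is full. On success, takes one reference on the device and
 * one on the vIOMMU, since the vDEVICE depends on both.
 */
static struct demo_vdevice *demo_vdevice_alloc(struct demo_viommu *v,
					       struct demo_device *dev,
					       uint64_t virt_id)
{
	size_t i;

	if (demo_vdevice_find(v, virt_id))
		return NULL;
	for (i = 0; i < DEMO_MAX_VDEVS; i++) {
		if (v->vdevs[i].in_use)
			continue;
		v->vdevs[i].virt_id = virt_id;
		v->vdevs[i].dev = dev;
		v->vdevs[i].in_use = 1;
		dev->refs++;
		v->refs++;
		return &v->vdevs[i];
	}
	return NULL;
}
```

Holding both references keeps the device and the vIOMMU alive for as long as the vDEVICE that connects them exists.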