summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-11-07media: videobuf2-core: copy vb planes unconditionallyTudor Ambarus
Copy the relevant data from userspace to the vb->planes unconditionally as it's possible some of the fields may have changed after the buffer has been validated. Keep the dma_buf_put(planes[plane].dbuf) calls in the first `if (!reacquired)` case, in order to be close to the plane validation code where the buffers were got in the first place. Cc: stable@vger.kernel.org Fixes: 95af7c00f35b ("media: videobuf2-core: release all planes first in __prepare_dmabuf()") Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Tested-by: Will McVicker <willmcvicker@google.com> Acked-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07Merge tag 'nf-next-24-11-07' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following series contains Netfilter updates for net-next: 1) Make legacy xtables configs user selectable, from Breno Leitao. 2) Fix a few sparse warnings related to percpu, from Uros Bizjak. 3) Use strscpy_pad, from Justin Stitt. 4) Use nft_trans_elem_alloc() in catchall flush, from Florian Westphal. 5) A series of 7 patches to fix false positive with CONFIG_RCU_LIST=y. Florian also sees possible issue with 10 while module load/removal when requesting an expression that is available via module. As for patch 11, object is being updated so reference on the module already exists so I don't see any real issue. Florian says: "Unfortunately there are many more errors, and not all are false positives. First patches pass lockdep_commit_lock_is_held() to the rcu list traversal macro so that those splats are avoided. The last two patches are real code change as opposed to 'pass the transaction mutex to relax rcu check': Those two lists are not protected by transaction mutex so could be altered in parallel. This targets nf-next because these are long-standing issues." netfilter pull request 24-11-07 * tag 'nf-next-24-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_tables: must hold rcu read lock while iterating object type list netfilter: nf_tables: must hold rcu read lock while iterating expression type list netfilter: nf_tables: avoid false-positive lockdep splats with basechain hook netfilter: nf_tables: avoid false-positive lockdep splats in set walker netfilter: nf_tables: avoid false-positive lockdep splats with flowtables netfilter: nf_tables: avoid false-positive lockdep splats with sets netfilter: nf_tables: avoid false-positive lockdep splat on rule deletion netfilter: nf_tables: prefer nft_trans_elem_alloc helper netfilter: nf_tables: replace deprecated strncpy with strscpy_pad netfilter: nf_tables: Fix percpu address space issues in nf_tables_api.c netfilter: Make legacy configs user selectable ==================== Link: https://patch.msgid.link/20241106234625.168468-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-07Merge branch 'virtio_net-make-rss-interact-properly-with-queue-number'Paolo Abeni
Philo Lu says: ==================== virtio_net: Make RSS interact properly with queue number With this patch set, RSS updates with queue_pairs changing: - When virtnet_probe, init default rss and commit - When queue_pairs changes _without_ user rss configuration, update rss with the new queue number - When queue_pairs changes _with_ user rss configuration, keep rss as user configured Patch 1 and 2 fix possible out of bound errors for indir_table and key. Patch 3 and 4 add RSS update in probe() and set_queues(). ==================== Link: https://patch.msgid.link/20241104085706.13872-1-lulie@linux.alibaba.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-07virtio_net: Update rss when set queuePhilo Lu
RSS configuration should be updated with queue number. In particular, it should be updated when (1) rss enabled and (2) default rss configuration is used without user modification. During rss command processing, device updates queue_pairs using rss.max_tx_vq. That is, the device updates queue_pairs together with rss, so we can skip the sperate queue_pairs update (VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET below) and return directly. Also remove the `vi->has_rss ?` check when setting vi->rss.max_tx_vq, because this is not used in the other hash_report case. Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.") Signed-off-by: Philo Lu <lulie@linux.alibaba.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-07virtio_net: Sync rss config to device when virtnet_probePhilo Lu
During virtnet_probe, default rss configuration is initialized, but was not committed to the device. This patch fix this by sending rss command after device ready in virtnet_probe. Otherwise, the actual rss configuration used by device can be different with that read by user from driver, which may confuse the user. If the command committing fails, driver rss will be disabled. Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.") Signed-off-by: Philo Lu <lulie@linux.alibaba.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Joe Damato <jdamato@fastly.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-07virtio_net: Add hash_key_length checkPhilo Lu
Add hash_key_length check in virtnet_probe() to avoid possible out of bound errors when setting/reading the hash key. Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.") Signed-off-by: Philo Lu <lulie@linux.alibaba.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Joe Damato <jdamato@fastly.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-07virtio_net: Support dynamic rss indirection table sizePhilo Lu
When reading/writing virtio_net_ctrl_rss, we get the indirection table size from vi->rss_indir_table_size, which is initialized in virtnet_probe(). However, the actual size of indirection_table was set as VIRTIO_NET_RSS_MAX_TABLE_LEN=128. This collision may cause issues if the vi->rss_indir_table_size exceeds 128. This patch instead uses dynamic indirection table, allocated with vi->rss after vi->rss_indir_table_size initialized. And free it in virtnet_remove(). In virtnet_commit_rss_command(), sgs for rss is initialized differently with hash_report. So indirection_table is not used if !vi->has_rss, and then we don't need to alloc indirection_table for hash_report only uses. Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.") Signed-off-by: Philo Lu <lulie@linux.alibaba.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Joe Damato <jdamato@fastly.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-07arm64: fix .data.rel.ro size assertion when CONFIG_LTO_CLANGMasahiro Yamada
Commit be2881824ae9 ("arm64/build: Assert for unwanted sections") introduced an assertion to ensure that the .data.rel.ro section does not exist. However, this check does not work when CONFIG_LTO_CLANG is enabled, because .data.rel.ro matches the .data.[0-9a-zA-Z_]* pattern in the DATA_MAIN macro. Move the ASSERT() above the RW_DATA() line. Fixes: be2881824ae9 ("arm64/build: Assert for unwanted sections") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20241106161843.189927-1-masahiroy@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-07netfilter: nf_tables: wait for rcu grace period on net_device removalPablo Neira Ayuso
8c873e219970 ("netfilter: core: free hooks with call_rcu") removed synchronize_net() call when unregistering basechain hook, however, net_device removal event handler for the NFPROTO_NETDEV was not updated to wait for RCU grace period. Note that 835b803377f5 ("netfilter: nf_tables_netdev: unregister hooks on net_device removal") does not remove basechain rules on device removal, I was hinted to remove rules on net_device removal later, see 5ebe0b0eec9d ("netfilter: nf_tables: destroy basechain and rules on netdevice removal"). Although NETDEV_UNREGISTER event is guaranteed to be handled after synchronize_net() call, this path needs to wait for rcu grace period via rcu callback to release basechain hooks if netns is alive because an ongoing netlink dump could be in progress (sockets hold a reference on the netns). Note that nf_tables_pre_exit_net() unregisters and releases basechain hooks but it is possible to see NETDEV_UNREGISTER at a later stage in the netns exit path, eg. veth peer device in another netns: cleanup_net() default_device_exit_batch() unregister_netdevice_many_notify() notifier_call_chain() nf_tables_netdev_event() __nft_release_basechain() In this particular case, same rule of thumb applies: if netns is alive, then wait for rcu grace period because netlink dump in the other netns could be in progress. Otherwise, if the other netns is going away then no netlink dump can be in progress and basechain hooks can be released inmediately. While at it, turn WARN_ON() into WARN_ON_ONCE() for the basechain validation, which should not ever happen. Fixes: 835b803377f5 ("netfilter: nf_tables_netdev: unregister hooks on net_device removal") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-11-07arm64: Kconfig: Make SME depend on BROKEN for nowMark Rutland
Although support for SME was merged in v5.19, we've since uncovered a number of issues with the implementation, including issues which might corrupt the FPSIMD/SVE/SME state of arbitrary tasks. While there are patches to address some of these issues, ongoing review has highlighted additional functional problems, and more time is necessary to analyse and fix these. For now, mark SME as BROKEN in the hope that we can fix things properly in the near future. As SME is an OPTIONAL part of ARMv9.2+, and there is very little extant hardware, this should not adversely affect the vast majority of users. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: stable@vger.kernel.org # 5.19 Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20241106164220.2789279-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-11-07arm64: smccc: Remove broken support for SMCCCv1.3 SVE discard hintMark Rutland
SMCCCv1.3 added a hint bit which callers can set in an SMCCC function ID (AKA "FID") to indicate that it is acceptable for the SMCCC implementation to discard SVE and/or SME state over a specific SMCCC call. The kernel support for using this hint is broken and SMCCC calls may clobber the SVE and/or SME state of arbitrary tasks, though FPSIMD state is unaffected. The kernel support is intended to use the hint when there is no SVE or SME state to save, and to do this it checks whether TIF_FOREIGN_FPSTATE is set or TIF_SVE is clear in assembly code: | ldr <flags>, [<current_task>, #TSK_TI_FLAGS] | tbnz <flags>, #TIF_FOREIGN_FPSTATE, 1f // Any live FP state? | tbnz <flags>, #TIF_SVE, 2f // Does that state include SVE? | | 1: orr <fid>, <fid>, ARM_SMCCC_1_3_SVE_HINT | 2: | << SMCCC call using FID >> This is not safe as-is: (1) SMCCC calls can be made in a preemptible context and preemption can result in TIF_FOREIGN_FPSTATE being set or cleared at arbitrary points in time. Thus checking for TIF_FOREIGN_FPSTATE provides no guarantee. (2) TIF_FOREIGN_FPSTATE only indicates that the live FP/SVE/SME state in the CPU does not belong to the current task, and does not indicate that clobbering this state is acceptable. When the live CPU state is clobbered it is necessary to update fpsimd_last_state.st to ensure that a subsequent context switch will reload FP/SVE/SME state from memory rather than consuming the clobbered state. This and the SMCCC call itself must happen in a critical section with preemption disabled to avoid races. (3) Live SVE/SME state can exist with TIF_SVE clear (e.g. with only TIF_SME set), and checking TIF_SVE alone is insufficient. Remove the broken support for the SMCCCv1.3 SVE saving hint. This is effectively a revert of commits: * cfa7ff959a78 ("arm64: smccc: Support SMCCC v1.3 SVE register saving hint") * a7c3acca5380 ("arm64: smccc: Save lr before calling __arm_smccc_sve_check()") ... leaving behind the ARM_SMCCC_VERSION_1_3 and ARM_SMCCC_1_3_SVE_HINT definitions, since these are simply definitions from the SMCCC specification, and the latter is used in KVM via ARM_SMCCC_CALL_HINTS. If we want to bring this back in future, we'll probably want to handle this logic in C where we can use all the usual FPSIMD/SVE/SME helper functions, and that'll likely require some rework of the SMCCC code and/or its callers. Fixes: cfa7ff959a78 ("arm64: smccc: Support SMCCC v1.3 SVE register saving hint") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: stable@vger.kernel.org Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241106160448.2712997-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-11-07x86/sev: Cleanup vc_handle_msr()Borislav Petkov (AMD)
Carve out the MSR_SVSM_CAA into a helper with the suggestion that upcoming future users should do the same. Rename that silly exit_info_1 into what it actually means in this function - whether the MSR access is a read or a write. No functional changes. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Link: https://lore.kernel.org/r/20241106172647.GAZyum1zngPDwyD2IJ@fat_crate.local
2024-11-07pwm: Assume a disabled PWM to emit a constant inactive outputUwe Kleine-König
Some PWM hardwares (e.g. MC33XS2410) cannot implement a zero duty cycle but can instead disable the hardware which also results in a constant inactive output. There are some checks (enabled with CONFIG_PWM_DEBUG) to help implementing a driver without violating the normal rounding rules. Make them less strict to let above described hardware pass without warning. Reported-by: Dimitri Fedrau <dima.fedrau@gmail.com> Link: https://lore.kernel.org/r/20241103205215.GA509903@debian Fixes: 3ad1f3a33286 ("pwm: Implement some checks for lowlevel drivers") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Reviewed-by: Dimitri Fedrau <dima.fedrau@gmail.com> Tested-by: Dimitri Fedrau <dima.fedrau@gmail.com> Link: https://lore.kernel.org/r/20241105153521.1001864-2-u.kleine-koenig@baylibre.com Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
2024-11-07media: venus: factor out inst destruction routineSergey Senozhatsky
Factor out common instance destruction code into a common function. Suggested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Signed-off-by: Stanimir Varbanov <stanimir.k.varbanov@gmail.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: venus: sync with threaded IRQ during inst destructionSergey Senozhatsky
When destroying an inst we should make sure that we don't race against threaded IRQ (or pending IRQ), otherwise we can concurrently kfree() inst context and inst itself. BUG: KASAN: slab-use-after-free in vb2_queue_error+0x80/0x90 Call trace: dump_backtrace+0x1c4/0x1f8 show_stack+0x38/0x60 dump_stack_lvl+0x168/0x1f0 print_report+0x170/0x4c8 kasan_report+0x94/0xd0 __asan_report_load2_noabort+0x20/0x30 vb2_queue_error+0x80/0x90 venus_helper_vb2_queue_error+0x54/0x78 venc_event_notify+0xec/0x158 hfi_event_notify+0x878/0xd20 hfi_process_msg_packet+0x27c/0x4e0 venus_isr_thread+0x258/0x6e8 hfi_isr_thread+0x70/0x90 venus_isr_thread+0x34/0x50 irq_thread_fn+0x88/0x130 irq_thread+0x160/0x2c0 kthread+0x294/0x328 ret_from_fork+0x10/0x20 Allocated by task 20291: kasan_set_track+0x4c/0x80 kasan_save_alloc_info+0x28/0x38 __kasan_kmalloc+0x84/0xa0 kmalloc_trace+0x7c/0x98 v4l2_m2m_ctx_init+0x74/0x280 venc_open+0x444/0x6d0 v4l2_open+0x19c/0x2a0 chrdev_open+0x374/0x3f0 do_dentry_open+0x710/0x10a8 vfs_open+0x88/0xa8 path_openat+0x1e6c/0x2700 do_filp_open+0x1a4/0x2e0 do_sys_openat2+0xe8/0x508 do_sys_open+0x15c/0x1a0 __arm64_sys_openat+0xa8/0xc8 invoke_syscall+0xdc/0x270 el0_svc_common+0x1ec/0x250 do_el0_svc+0x54/0x70 el0_svc+0x50/0xe8 el0t_64_sync_handler+0x48/0x120 el0t_64_sync+0x1a8/0x1b0 Freed by task 20291: kasan_set_track+0x4c/0x80 kasan_save_free_info+0x3c/0x60 ____kasan_slab_free+0x124/0x1a0 __kasan_slab_free+0x18/0x28 __kmem_cache_free+0x134/0x300 kfree+0xc8/0x1a8 v4l2_m2m_ctx_release+0x44/0x60 venc_close+0x78/0x130 [venus_enc] v4l2_release+0x20c/0x2f8 __fput+0x328/0x7f0 ____fput+0x2c/0x48 task_work_run+0x1e0/0x280 get_signal+0xfb8/0x1190 do_notify_resume+0x34c/0x16a8 el0_svc+0x9c/0xe8 el0t_64_sync_handler+0x48/0x120 el0t_64_sync+0x1a8/0x1b0 Rearrange inst destruction. First remove the inst from the core->instances list, second synchronize IRQ/IRQ-thread to make sure that nothing else would see the inst while we take it down. Fixes: 7472c1c69138 ("[media] media: venus: vdec: add video decoder files") Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Signed-off-by: Stanimir Varbanov <stanimir.k.varbanov@gmail.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: venus: fix enc/dec destruction orderSergey Senozhatsky
We destroy mutex-es too early as they are still taken in v4l2_fh_exit()->v4l2_event_unsubscribe()->v4l2_ctrl_find(). We should destroy mutex-es right before kfree(). Also do not vdec_ctrl_deinit() before v4l2_fh_exit(). Fixes: 7472c1c69138 ("[media] media: venus: vdec: add video decoder files") Suggested-by: Tomasz Figa <tfiga@google.com> Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Signed-off-by: Stanimir Varbanov <stanimir.k.varbanov@gmail.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: venus: Enable h.264 hierarchical codingFritz Koenig
HFI supports hierarchical P encoding and the ability to specify the bitrate for the different layers. Connect the controls that V4L2 provides and HFI supports. Signed-off-by: Fritz Koenig <frkoenig@chromium.org> Reviewed-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Signed-off-by: Stanimir Varbanov <stanimir.k.varbanov@gmail.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: venus: Helper function for dynamically updating bitrateFritz Koenig
Move the dynamic bitrate updating functionality to a separate function so that it can be shared. No functionality changes. Signed-off-by: Fritz Koenig <frkoenig@chromium.org> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Signed-off-by: Stanimir Varbanov <stanimir.k.varbanov@gmail.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07eth: fbnic: Add support to write TCE TCAM entriesMohsin Bashir
Add support to redirect host-to-BMC traffic by writing MACDA entries from the RPC (RX Parser and Classifier) to TCE-TCAM. The TCE TCAM is a small L2 destination TCAM which is placed at the end of the TX path (TCE). Unlike other NICs, where BMC diversion is typically handled by firmware, for fbnic, firmware does not touch anything related to the host; hence, the host uses TCE TCAM to divert BMC traffic. Currently, we lack metadata to track where addresses have been written in the TCAM, except for the last entry written. To address this issue, we start at the opposite end of the table in each pass, so that adding or deleting entries does not affect the availability of all entries, assuming there is no significant reordering of entries. Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20241104031300.1330657-1-mohsin.bashr@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-07Merge tag 'intel-gpio-v6.13-1' of ↵Bartosz Golaszewski
git://git.kernel.org/pub/scm/linux/kernel/git/andy/linux-gpio-intel into gpio/for-next intel-gpio for v6.13-1 * Use device_for_each_child_node_scoped() in ACPI routines The following is an automated git shortlog grouped by driver: acpi: - switch to device_for_each_child_node_scoped()
2024-11-07s390/pci: Add header guards and includes to internal headersNiklas Schnelle
While pci_iov.h has include guards it doesn't include the necessary header for struct zpci_dev, pci_bus.h on the other hand lacks even basic include guards. This isn't only fragile and breaks convention but also confuses tooling such as clang-analyzer. Add both include guards and the necessary includes. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07s390/uvdevice: Fix and slightly improve kernel-doc commentHeiko Carstens
Fix incorrect kernel-doc comment style, add missing return statement, fix incorrect parameter name, and add some additional consistency across all kernel-doc comments. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07s390/uvdevice: Support longer secret listsSteffen Eiden
Enable the list IOCTL to provide lists longer than one page (85 entries). The list IOCTL now accepts any argument length in page granularity. It fills the argument up to this length with entries until the list ends. User space unaware of this enhancement will still receive one page of data and an uv_rc 0x0100. Signed-off-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Link: https://lore.kernel.org/r/20241104153609.1361388-1-seiden@linux.ibm.com Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Message-ID: <20241104153609.1361388-1-seiden@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07s390/sparsemem: Provide phys_to_target_node() with CONFIG_NUMAHeiko Carstens
Add a trival phys_to_target_node() implementation which always returns 0 if CONFIG_NUMA is enabled, since the s390 NUMA implementation only supports node 0. This is similar to memory_add_physaddr_to_nid() in order to avoid runtime warnings. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07s390/configs: Enable CONFIG_VIRTIO_MEMHeiko Carstens
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07Merge branch 'virtio-mem' into featuresHeiko Carstens
David Hildenbrand says: ==================== virtio-mem: s390 support Let's finally add s390 support for virtio-mem; my last RFC was sent 4 years ago, and a lot changed in the meantime. The latest QEMU series is available at [1], which contains some more details and a usage example on s390 (last patch). There is not too much in here: The biggest part is querying a new diag(500) STORAGE_LIMIT hypercall to obtain the proper "max_physmem_end". The last three patches are not strictly required but certainly nice-to-have. Note that -- in contrast to standby memory -- virtio-mem memory must be configured to be automatically onlined as soon as hotplugged. The easiest approach is using the "memhp_default_state=" kernel parameter or by using proper udev rules. More details can be found at [2]. I have reviving+upstreaming a systemd service to handle configuring that on my todo list, but for some reason I keep getting distracted ... I tested various things, including: * Various memory hotplug/hotunplug combinations * Device hotplug/hotunplug * /proc/iomem output * reboot * kexec * kdump: make sure we properly enter the "kdump mode" in the virtio-mem driver kdump support for virtio-mem memory on s390 will be sent out separately. v2 -> v3 * "s390/kdump: make is_kdump_kernel() consistently return "true" in kdump environments only" -> Sent out separately [3] * "s390/physmem_info: query diag500(STORAGE LIMIT) to support QEMU/KVM memory devices" -> No query function for diag500 for now. -> Update comment above setup_ident_map_size(). -> Optimize/rewrite diag500_storage_limit() [Heiko] -> Change handling in detect_physmem_online_ranges [Alexander] -> Improve documentation. * "s390/sparsemem: provide memory_add_physaddr_to_nid() with CONFIG_NUMA" -> Added after testing on systems with CONFIG_NUMA=y v1 -> v2: * Document the new diag500 subfunction * Use "s390" instead of "s390x" consistently [1] https://lkml.kernel.org/r/20241008105455.2302628-1-david@redhat.com [2] https://virtio-mem.gitlab.io/user-guide/user-guide-linux.html [3] https://lkml.kernel.org/r/20241023090651.1115507-1-david@redhat.com ==================== Link: https://lore.kernel.org/r/20241025141453.1210600-1-david@redhat.com/ Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07s390/sparsemem: Provide memory_add_physaddr_to_nid() with CONFIG_NUMADavid Hildenbrand
virtio-mem uses memory_add_physaddr_to_nid() to determine the NID to use for memory it adds. We currently fallback to the dummy implementation in mm/numa.c with CONFIG_NUMA, which will end up triggering an undesired pr_info_once(): Unknown online node for memory at 0x100000000, assuming node 0 On s390, we map all cpus and memory to node 0, so let's add a simple memory_add_physaddr_to_nid() implementation that does exactly that, but without complaining. Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20241025141453.1210600-8-david@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07s390/sparsemem: Reduce section size to 128 MiBDavid Hildenbrand
Ever since commit 421c175c4d609 ("[S390] Add support for memory hot-add.") we've been using a section size of 256 MiB on s390 and 32 MiB on s390. Before that, we were using a section size of 32 MiB on both architectures. Now that we have a new mechanism to expose additional memory to a VM -- virtio-mem -- reduce the section size to 128 MiB to allow for more flexibility and reduce the metadata overhead when dealing with hot(un)plug granularity smaller than 256 MiB. 128 MiB has been used by x86-64 since the very beginning. arm64 with 4k base pages switched to 128 MiB as well: it's just big enough on these architectures to allows for using a huge page (2 MiB) in the vmemmap in sane setups with sizeof(struct page) == 64 bytes and a huge page mapping in the direct mapping, while still allowing for small hot(un)plug granularity. For s390, we could even switch to a 64 MiB section size, as our huge page size is 1 MiB: but the smaller the section size, the more sections we'll have to manage especially on bigger machines. Making it consistent with x86-64 and arm64 feels like the right thing for now. Note that the smallest memory hot(un)plug granularity is also limited by the memory block size, determined by extracting the memory increment size from SCLP. Under QEMU/KVM, implementing virtio-mem, we expose 0; therefore, we'll end up with a memory block size of 128 MiB with a 128 MiB section size. Tested-by: Mario Casquero <mcasquer@redhat.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20241025141453.1210600-7-david@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07lib/Kconfig.debug: Default STRICT_DEVMEM to "y" on s390David Hildenbrand
virtio-mem currently depends on !DEVMEM | STRICT_DEVMEM. Let's default STRICT_DEVMEM to "y" just like we do for arm64 and x86. There could be ways in the future to filter access to virtio-mem device memory even without STRICT_DEVMEM, but for now let's just keep it simple. Tested-by: Mario Casquero <mcasquer@redhat.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: David Hildenbrand <david@redhat.com> Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20241025141453.1210600-6-david@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07virtio-mem: s390 supportDavid Hildenbrand
Now that s390 code is prepared for memory devices that reside above the maximum storage increment exposed through SCLP, everything is in place to unlock virtio-mem support. As virtio-mem in Linux currently supports logically onlining/offlining memory in pageblock granularity, we have an effective hot(un)plug granularity of 1 MiB on s390. As virito-mem adds/removes individual Linux memory blocks (256MB), we will currently never use gigantic pages in the identity mapping. It is worth noting that neither storage keys nor storage attributes (e.g., data / nodat) are touched when onlining memory blocks, which is good because we are not supposed to touch these parts for unplugged device blocks that are logically offline in Linux. We will currently never initialize storage keys for virtio-mem memory -- IOW, storage_key_init_range() is never called. It could be added in the future when plugging device blocks. But as that function essentially does nothing without modifying the code (changing PAGE_DEFAULT_ACC), that's just fine for now. kexec should work as intended and just like on other architectures that support virtio-mem: we will never place kexec binaries on virtio-mem memory, and never indicate virtio-mem memory to the 2nd kernel. The device driver in the 2nd kernel can simply reset the device -- turning all memory unplugged, to then start plugging memory and adding them to Linux, without causing trouble because the memory is already used elsewhere. The special s390 kdump mode, whereby the 2nd kernel creates the ELF core header, won't currently dump virtio-mem memory. The virtio-mem driver has a special kdump mode, from where we can detect memory ranges to dump. Based on this, support for dumping virtio-mem memory can be added in the future fairly easily. Acked-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20241025141453.1210600-5-david@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07s390/physmem_info: Query diag500(STORAGE LIMIT) to support QEMU/KVM memory ↵David Hildenbrand
devices To support memory devices under QEMU/KVM, such as virtio-mem, we have to prepare our kernel virtual address space accordingly and have to know the highest possible physical memory address we might see later: the storage limit. The good old SCLP interface is not suitable for this use case. In particular, memory owned by memory devices has no relationship to storage increments, it is always detected using the device driver, and unaware OSes (no driver) must never try making use of that memory. Consequently this memory is located outside of the "maximum storage increment"-indicated memory range. Let's use our new diag500 STORAGE_LIMIT subcode to query this storage limit that can exceed the "maximum storage increment", and use the existing interfaces (i.e., SCLP) to obtain information about the initial memory that is not owned+managed by memory devices. If a hypervisor does not support such memory devices, the address exposed through diag500 STORAGE_LIMIT will correspond to the maximum storage increment exposed through SCLP. To teach kdump on s390 to include memory owned by memory devices, there will be ways to query the relevant memory ranges from the device via a driver running in special kdump mode (like virtio-mem already implements to filter /proc/vmcore access so we don't end up reading from unplugged device blocks). Update setup_ident_map_size(), to clarify that there can be more than just online and standby memory. Tested-by: Mario Casquero <mcasquer@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20241025141453.1210600-4-david@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07Documentation: s390-diag.rst: Document diag500(STORAGE LIMIT) subfunctionDavid Hildenbrand
Let's document our new diag500 subfunction that can be implemented by userspace. Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20241025141453.1210600-3-david@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07Documentation: s390-diag.rst: Make diag500 a generic KVM hypercallDavid Hildenbrand
Let's make it a generic KVM hypercall, allowing other subfunctions to be more independent of virtio. While at it, document that unsupported/unimplemented subfunctions result in a SPECIFICATION exception. This is a preparation for documenting a new subfunction. Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20241025141453.1210600-2-david@redhat.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07s390/kvm: Mask extra bits from program interrupt codeClaudio Imbrenda
The program interrupt code has some extra bits that are sometimes set by hardware for various reasons; those bits should be ignored when the program interrupt number is needed for interrupt handling. Fixes: 05066cafa925 ("s390/mm/fault: Handle guest-related program interrupts in KVM") Reported-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Tested-by: Christian Borntraeger <borntraeger@linux.ibm.com> Link: https://lore.kernel.org/r/20241031120316.25462-1-imbrenda@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-07net: nfc: Propagate ISO14443 type A target ATS to userspace via netlinkJuraj Šarinay
Add a 20-byte field ats to struct nfc_target and expose it as NFC_ATTR_TARGET_ATS via the netlink interface. The payload contains 'historical bytes' that help to distinguish cards from one another. The information is commonly used to assemble an emulated ATR similar to that reported by smart cards with contacts. Add a 20-byte field target_ats to struct nci_dev to hold the payload obtained in nci_rf_intf_activated_ntf_packet() and copy it to over to nfc_target.ats in nci_activate_target(). The approach is similar to the handling of 'general bytes' within ATR_RES. Replace the hard-coded size of rats_res within struct activation_params_nfca_poll_iso_dep by the equal constant NFC_ATS_MAXSIZE now defined in nfc.h Within NCI, the information corresponds to the 'RATS Response' activation parameter that omits the initial length byte TL. This loses no information and is consistent with our handling of SENSB_RES that also drops the first (constant) byte. Tested with nxp_nci_i2c on a few type A targets including an ICAO 9303 compliant passport. I refrain from the corresponding change to digital_in_recv_ats() to have the few drivers based on digital.h fill nfc_target.ats, as I have no way to test it. That class of drivers appear not to set NFC_ATTR_TARGET_SENSB_RES either. Consider a separate patch to propagate (all) the parameters. Signed-off-by: Juraj Šarinay <juraj@sarinay.com> Link: https://patch.msgid.link/20241103124525.8392-1-juraj@sarinay.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-07drm/sched: Improve teardown documentationPhilipp Stanner
If jobs are still enqueued in struct drm_gpu_scheduler.pending_list when drm_sched_fini() gets called, those jobs will be leaked since that function stops both job-submission and (automatic) job-cleanup. It is, thus, up to the driver to take care of preventing leaks. The related function drm_sched_wqueue_stop() also prevents automatic job cleanup. Those pitfals are not reflected in the documentation, currently. Explicitly inform about the leak problem in the docstring of drm_sched_fini(). Additionally, detail the purpose of drm_sched_wqueue_{start,stop} and hint at the consequences for automatic cleanup. Signed-off-by: Philipp Stanner <pstanner@redhat.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241105143137.71893-2-pstanner@redhat.com
2024-11-07net: stmmac: Fix unbalanced IRQ wake disable warning on single irq caseNícolas F. R. A. Prado
Commit a23aa0404218 ("net: stmmac: ethtool: Fixed calltrace caused by unbalanced disable_irq_wake calls") introduced checks to prevent unbalanced enable and disable IRQ wake calls. However it only initialized the auxiliary variable on one of the paths, stmmac_request_irq_multi_msi(), missing the other, stmmac_request_irq_single(). Add the same initialization on stmmac_request_irq_single() to prevent "Unbalanced IRQ <x> wake disable" warnings from being printed the first time disable_irq_wake() is called on platforms that run on that code path. Fixes: a23aa0404218 ("net: stmmac: ethtool: Fixed calltrace caused by unbalanced disable_irq_wake calls") Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241101-stmmac-unbalanced-wake-single-fix-v1-1-5952524c97f0@collabora.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-07media: i2c: dw9768: Use runtime PM autosuspendZhi Mao
Use runtime PM autosuspend function to avoid rapid power state bouncing. Signed-off-by: Zhi Mao <zhi.mao@mediatek.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: i2c: ov5645: Switch to {enable,disable}_streamsLad Prabhakar
Switch from s_stream() to enable_streams() and disable_streams() pad operations. They are preferred and required for streams support. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: i2c: ov5645: Use subdev active stateLad Prabhakar
Port the ov5645 sensor driver to use the subdev active state. Move all the format configuration to the subdevice state and simplify the format handling, locking and initialization. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: i2c: ov5645: Drop `power_lock` mutexLad Prabhakar
Remove the `power_lock` mutex used during control applications, as it is only utilized in the .s_ctrl() function. Since the control framework already serializes calls to this function, the mutex is unnecessary. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: i2c: ov5645: Use v4l2_async_register_subdev_sensor()Lad Prabhakar
Utilize the v4l2_async_register_subdev_sensor() helper to register the sub-device, as this facilitates parsing of firmware interfaces for remote references. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: i2c: ov5645: Replace dev_err with dev_err_probe in probe functionLad Prabhakar
Refactor error handling in the ov5645_probe() function by replacing multiple dev_err() calls with dev_err_probe(). - Note that during this process, the error string "external clock frequency %u is not supported" was replaced with "unsupported xclk frequency %u" to ensure it wraps at 80 columns. - Additionally, the error string for control initialization failure was changed from "%s: control initialization error %d\n" to "failed to add controls\n" as there is no need to print the function name and error code in the string, since dev_err_probe() already provides this information. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: i2c: ov5645: Use local `dev` pointer for subdev device assignmentLad Prabhakar
While assigning the subdev device pointer, use the local `dev` pointer which is already extracted from the `i2c_client` pointer. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Tommaso Merciai <tomm.merciai@gmail.com> Tested-by: Tommaso Merciai <tomm.merciai@gmail.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: ipu6: make the ipu6_mmu_unmap() as a void functionBingbu Cao
The DMA unmap API is not supposed to return value. Thus this patch changes the ipu6_mmu_unmap() as a void function and DMA unmapping didn't check the return value. Signed-off-by: Bingbu Cao <bingbu.cao@intel.com> [Sakari Ailus: Drop unnecessary returns.] Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: ipu6: optimize the IPU6 MMU unmapping flowBingbu Cao
The MMU mapping flow is optimized for improve the performance, the unmapping flow could also be optimized to follow same flow. Signed-off-by: Bingbu Cao <bingbu.cao@intel.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: ipu6: optimize the IPU6 MMU mapping flowBingbu Cao
ipu6_mmu_map() operated on a per-page basis, it leads frequent spin_lock/unlock() and clflush_cache_range() for each page, it will cause inefficiencies especially when handling dma-bufs with large number of pages. However, the pages are likely concentrated pages by IOMMU DMA driver, IPU MMU driver can map the concentrated pages into less entries in l1 table. This change enhances ipu6_mmu_map() with batching process multiple contiguous pages. It significantly reduces calls for spin_lock/unlock and clflush_cache_range() and improve the performance. Signed-off-by: Bingbu Cao <bingbu.cao@intel.com> Signed-off-by: Jianhui Dai <jianhui.j.dai@intel.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: ipu6: move the l2_unmap() up before l2_map()Bingbu Cao
l2_map() and l2_unmap() are better to be grouped together. l2_unmap() will soon be called from l2_map() for mapping optimization. Signed-off-by: Bingbu Cao <bingbu.cao@intel.com> Signed-off-by: Jianhui Dai <jianhui.j.dai@intel.com> [Sakari Ailus: Rebase on debug print fixes on 32-bit.] Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: mc: Rename pad as origin in __media_pipeline_start()Sakari Ailus
Rename the pad field in __media_pipeline_start() to both better describe what it is and avoid masking it during the loop. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
2024-11-07media: intel/ipu6: remove buttress ish structureStanislaw Gruszka
The buttress ipc ish structure is not effectively used on IPU6 - data is nullified on init. Remove the ish structure and handing of related interrupts to cleanup the code. Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>