summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-03-20nvmet: Remove duplicate uuid_copyMike Christie
We do uuid_copy twice in nvmet_alloc_ctrl so this patch deletes one of the calls. Signed-off-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvme: zns: Simplify nvme_zone_parse_entry()Damien Le Moal
Instead of passing a pointer to a struct nvme_ctrl and a pointer to a struct nvme_ns_head as the first two arguments of nvme_zone_parse_entry(), pass only a pointer to a struct nvme_ns as both the controller structure and ns head structure can be infered from the namespace structure. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvmet: pci-epf: Remove redundant 'flush_workqueue()' callsChen Ni
'destroy_workqueue()' already drains the queue before destroying it, so there is no need to flush it explicitly. Remove the redundant 'flush_workqueue()' calls. This was generated with coccinelle: @@ expression E; @@ - flush_workqueue(E); destroy_workqueue(E); Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvmet-fc: Remove unused functionsWangYuli
The functions nvmet_fc_iodnum() and nvmet_fc_fodnum() are currently unutilized. Following commit c53432030d86 ("nvme-fabrics: Add target support for FC transport"), which introduced these two functions, they have not been used at all in practice. Remove them to resolve the compiler warnings. Fix follow errors with clang-19 when W=1e: drivers/nvme/target/fc.c:177:1: error: unused function 'nvmet_fc_iodnum' [-Werror,-Wunused-function] 177 | nvmet_fc_iodnum(struct nvmet_fc_ls_iod *iodptr) | ^~~~~~~~~~~~~~~ drivers/nvme/target/fc.c:183:1: error: unused function 'nvmet_fc_fodnum' [-Werror,-Wunused-function] 183 | nvmet_fc_fodnum(struct nvmet_fc_fcp_iod *fodptr) | ^~~~~~~~~~~~~~~ 2 errors generated. make[8]: *** [scripts/Makefile.build:207: drivers/nvme/target/fc.o] Error 1 make[7]: *** [scripts/Makefile.build:465: drivers/nvme/target] Error 2 make[6]: *** [scripts/Makefile.build:465: drivers/nvme] Error 2 make[6]: *** Waiting for unfinished jobs.... Fixes: c53432030d86 ("nvme-fabrics: Add target support for FC transport") Signed-off-by: WangYuli <wangyuli@uniontech.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvme-pci: remove stale commentBaruch Siach
The ns variable has been removed in commit 62451a2b2e7e ("nvme: separate command prep and issue"). Drop reference to ns in comment. Fixes: 62451a2b2e7e ("nvme: separate command prep and issue") Signed-off-by: Baruch Siach <baruch@tkos.co.il> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvme-fc: Utilise min3() to simplify queue count calculationQasim Ijaz
Refactor nvme_fc_create_io_queues() and nvme_fc_recreate_io_queues() to use the min3() macro to find the minimum between 3 values instead of multiple min()'s. This shortens the code and makes it easier to read. Signed-off-by: Qasim Ijaz <qasdev00@gmail.com> Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvme-multipath: Add visibility for queue-depth io-policyNilay Shroff
This patch helps add nvme native multipath visibility for queue-depth io-policy. It adds a new attribute file named "queue_depth" under namespace device path node which would print the number of active/ in-flight I/O requests currently queued for the given path. For instance, if we have a shared namespace accessible from two different controllers/paths then accessing head block node of the shared namespace would show the following output: $ ls -l /sys/block/nvme1n1/multipath/ nvme1c1n1 -> ../../../../../pci052e:78/052e:78:00.0/nvme/nvme1/nvme1c1n1 nvme1c3n1 -> ../../../../../pci058e:78/058e:78:00.0/nvme/nvme3/nvme1c3n1 In the above example, nvme1n1 is head gendisk node created for a shared namespace and the namespace is accessible from nvme1c1n1 and nvme1c3n1 paths. For queue-depth io-policy we can then refer the "queue_depth" attribute file created under each namespace path: $ cat /sys/block/nvme1n1/multipath/nvme1c1n1/queue_depth 518 $cat /sys/block/nvme1n1/multipath/nvme1c3n1/queue_depth 504 >From the above output, we can infer that I/O workload targeted at nvme1n1 uses two paths nvme1c1n1 and nvme1c3n1 and the current queue depth of each path is 518 and 504 respectively. Reading "queue_depth" file when configured io-policy is anything but queue-depth would show no output. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvme-multipath: Add visibility for numa io-policyNilay Shroff
This patch helps add nvme native multipath visibility for numa io-policy. It adds a new attribute file named "numa_nodes" under namespace gendisk device path node which prints the list of numa nodes preferred by the given namespace path. The numa nodes value is comma delimited list of nodes or A-B range of nodes. For instance, if we have a shared namespace accessible from two different controllers/paths then accessing head node of the shared namespace would show the following output: $ ls -l /sys/block/nvme1n1/multipath/ nvme1c1n1 -> ../../../../../pci052e:78/052e:78:00.0/nvme/nvme1/nvme1c1n1 nvme1c3n1 -> ../../../../../pci058e:78/058e:78:00.0/nvme/nvme3/nvme1c3n1 In the above example, nvme1n1 is head gendisk node created for a shared namespace and this namespace is accessible from nvme1c1n1 and nvme1c3n1 paths. For numa io-policy we can then refer the "numa_nodes" attribute file created under each namespace path: $ cat /sys/block/nvme1n1/multipath/nvme1c1n1/numa_nodes 0-1 $ cat /sys/block/nvme1n1/multipath/nvme1c3n1/numa_nodes 2-3 >From the above output, we infer that I/O workload targeted at nvme1n1 and running on numa nodes 0 and 1 would prefer using path nvme1c1n1. Similarly, I/O workload running on numa nodes 2 and 3 would prefer using path nvme1c3n1. Reading "numa_nodes" file when configured io-policy is anything but numa would show no output. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvme-multipath: Add visibility for round-robin io-policyNilay Shroff
This patch helps add nvme native multipath visibility for round-robin io-policy. It creates a "multipath" sysfs directory under head gendisk device node directory and then from "multipath" directory it adds a link to each namespace path device the head node refers. For instance, if we have a shared namespace accessible from two different controllers/paths then we create a soft link to each path device from head disk node as shown below: $ ls -l /sys/block/nvme1n1/multipath/ nvme1c1n1 -> ../../../../../pci052e:78/052e:78:00.0/nvme/nvme1/nvme1c1n1 nvme1c3n1 -> ../../../../../pci058e:78/058e:78:00.0/nvme/nvme3/nvme1c3n1 In the above example, nvme1n1 is head gendisk node created for a shared namespace and the namespace is accessible from nvme1c1n1 and nvme1c3n1 paths. For round-robin I/O policy, we could easily infer from the above output that I/O workload targeted to nvme1n1 would toggle across paths nvme1c1n1 and nvme1c3n1. Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvmet: add tls_concat and tls_key debugfs entriesHannes Reinecke
Add debugfs entries to display the 'concat' and 'tls_key' controller attributes. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvmet-tcp: support secure channel concatenationHannes Reinecke
Evaluate the SC_C flag during DH-CHAP-HMAC negotiation to check if secure concatenation as specified in the NVMe Base Specification v2.1, section 8.3.4.3: "Secure Channel Concatenationand" is requested. If requested the generated PSK is inserted into the keyring once negotiation has finished allowing for an encrypted connection once the admin queue is restarted. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvmet: Add 'sq' argument to alloc_ctrl_argsHannes Reinecke
For secure concatenation the result of the TLS handshake will be stored in the 'sq' struct, so add it to the alloc_ctrl_args struct. Cc: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvme-fabrics: reset admin connection for secure concatenationHannes Reinecke
When secure concatenation is requested the connection needs to be reset to enable TLS encryption on the new cnnection. That implies that the original connection used for the DH-CHAP negotiation really shouldn't be used, and we should reset as soon as the DH-CHAP negotiation has succeeded on the admin queue. Based on an idea from Sagi. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvme-tcp: request secure channel concatenationHannes Reinecke
Add a fabrics option 'concat' to request secure channel concatenation as specified the NVME Base Specification v2.1, section 8.3.4.3: Secure Channel Concatenation. When secure channel concatenation is enabled a 'generated PSK' is inserted into the keyring such that it's available after reset. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvme-keyring: add nvme_tls_psk_refresh()Hannes Reinecke
Add a function to refresh a generated PSK in the specified keyring. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvme: add nvme_auth_derive_tls_psk()Hannes Reinecke
Add a function to derive the TLS PSK as specified TP8018. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvme: add nvme_auth_generate_digest()Hannes Reinecke
Add a function to calculate the PSK digest as specified in TP8018. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20nvme: add nvme_auth_generate_psk()Hannes Reinecke
Add a function to generate a NVMe PSK from the shared credentials negotiated by DH-HMAC-CHAP. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20crypto,fs: Separate out hkdf_extract() and hkdf_expand()Hannes Reinecke
Separate out the HKDF functions into a separate module to to make them available to other callers. And add a testsuite to the module with test vectors from RFC 5869 (and additional vectors for SHA384 and SHA512) to ensure the integrity of the algorithm. Signed-off-by: Hannes Reinecke <hare@kernel.org> Acked-by: Eric Biggers <ebiggers@kernel.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-03-20selftests: ublk: add variable for user to not show test resultMing Lei
Some user decides test result by exit code only, and wouldn't like to be bothered by the test result. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250320013743.4167489-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-20selftests: ublk: don't show `modprobe` failureMing Lei
ublk_drv may be built-in, so don't show modprobe failure, and we do check `/dev/ublk-control` for skipping test if ublk_drv isn't enabled. Reported-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250320013743.4167489-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-20selftests: ublk: add one dependency headerMing Lei
Add one dependency helper which can include new uapi definition which isn't synced from kernel. This way also helps a lot for downstream test deployment. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250320013743.4167489-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-20cpumask: align text in commentJoel Savitz
Since commit 4e1a7df45480 ("cpumask: Add enabled cpumask for present CPUs that can be brought online") introduced cpu_enabled_mask, the comment line describing the mask has been slightly out of alignment with the adjacent lines. Fix this by removing a single space character. Signed-off-by: Joel Savitz <jsavitz@redhat.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2025-03-20hyperv: Add definitions for root partition driver to hv headersNuno Das Neves
A few additional definitions are required for the mshv driver code (to follow). Introduce those here and clean up a little bit while at it. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/r/1741980536-3865-10-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1741980536-3865-10-git-send-email-nunodasneves@linux.microsoft.com>
2025-03-20x86: hyperv: Add mshv_handler() irq handler and setup functionNuno Das Neves
Add mshv_handler() to process messages related to managing guest partitions such as intercepts, doorbells, and scheduling messages. In a (non-nested) root partition, the same interrupt vector is shared between the vmbus and mshv_root drivers. Introduce a stub for mshv_handler() and call it in sysvec_hyperv_callback alongside vmbus_handler(). Even though both handlers will be called for every Hyper-V interrupt, the messages for each driver are delivered to different offsets within the SYNIC message page, so they won't step on each other. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Wei Liu <wei.liu@kernel.org> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Link: https://lore.kernel.org/r/1741980536-3865-9-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1741980536-3865-9-git-send-email-nunodasneves@linux.microsoft.com>
2025-03-20Drivers: hv: Introduce per-cpu event ring tailNuno Das Neves
Add a pointer hv_synic_eventring_tail to track the tail pointer for the SynIC event ring buffer for each SINT. This will be used by the mshv driver, but must be tracked independently since the driver module could be removed and re-inserted. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Wei Liu <wei.liu@kernel.org> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Link: https://lore.kernel.org/r/1741980536-3865-8-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1741980536-3865-8-git-send-email-nunodasneves@linux.microsoft.com>
2025-03-20Drivers: hv: Export some functions for use by root partition moduleNuno Das Neves
hv_get_hypervisor_version(), hv_call_deposit_pages(), and hv_call_create_vp(), are all needed in-module with CONFIG_MSHV_ROOT=m. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@microsoft.linux.com> Reviewed-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/r/1741980536-3865-7-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1741980536-3865-7-git-send-email-nunodasneves@linux.microsoft.com>
2025-03-20acpi: numa: Export node_to_pxm()Nuno Das Neves
node_to_pxm() is used by hv_numa_node_to_pxm_info(). That helper will be used by Hyper-V root partition module code when CONFIG_MSHV_ROOT=m. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/r/1741980536-3865-6-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1741980536-3865-6-git-send-email-nunodasneves@linux.microsoft.com>
2025-03-20hyperv: Introduce hv_recommend_using_aeoi()Nuno Das Neves
Factor out the check for enabling auto eoi, to be reused in root partition code. No functional changes. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/r/1741980536-3865-5-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1741980536-3865-5-git-send-email-nunodasneves@linux.microsoft.com>
2025-03-20arm64/hyperv: Add some missing functions to arm64Nuno Das Neves
These non-nested msr and fast hypercall functions are present in x86, but they must be available in both architectures for the root partition driver code. While at it, remove the redundant 'extern' keywords from the hv_do_hypercall() variants in asm-generic/mshyperv.h. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Reviewed-by: Roman Kisel <romank@linux.microsoft.com> Link: https://lore.kernel.org/r/1741980536-3865-4-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1741980536-3865-4-git-send-email-nunodasneves@linux.microsoft.com>
2025-03-20x86/mshyperv: Add support for extended Hyper-V featuresStanislav Kinsburskii
Extend the "ms_hyperv_info" structure to include a new field, "ext_features", for capturing extended Hyper-V features. Update the "ms_hyperv_init_platform" function to retrieve these features using the cpuid instruction and include them in the informational output. Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Reviewed-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1741980536-3865-3-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1741980536-3865-3-git-send-email-nunodasneves@linux.microsoft.com>
2025-03-20hyperv: Log hypercall status codes as stringsNuno Das Neves
Introduce hv_status_printk() macros as a convenience to log hypercall errors, formatting them with the status code (HV_STATUS_*) as a raw hex value and also as a string, which saves some time while debugging. Create a table of HV_STATUS_ codes with strings and mapped errnos, and use it for hv_result_to_string() and hv_result_to_errno(). Use the new hv_status_printk()s in hv_proc.c, hyperv-iommu.c, and irqdomain.c hypercalls to aid debugging in the root partition. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Link: https://lore.kernel.org/r/1741980536-3865-2-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1741980536-3865-2-git-send-email-nunodasneves@linux.microsoft.com>
2025-03-20x86/hyperv: Fix check of return value from snp_set_vmsa()Tianyu Lan
snp_set_vmsa() returns 0 as success result and so fix it. Cc: stable@vger.kernel.org Fixes: 44676bb9d566 ("x86/hyperv: Add smp support for SEV-SNP guest") Signed-off-by: Tianyu Lan <tiala@microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20250313085217.45483-1-ltykernel@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250313085217.45483-1-ltykernel@gmail.com>
2025-03-20x86/hyperv: Add VTL mode callback for restarting the systemRoman Kisel
The kernel runs as a firmware in the VTL mode, and the only way to restart in the VTL mode on x86 is to triple fault. Thus, one has to always supply "reboot=t" on the kernel command line in the VTL mode, and missing that renders rebooting not working. Define the machine restart callback to always use the triple fault to provide the robust configuration by default. Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Link: https://lore.kernel.org/r/20250227214728.15672-3-romank@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250227214728.15672-3-romank@linux.microsoft.com>
2025-03-20x86/hyperv: Add VTL mode emergency restart callbackRoman Kisel
By default, X86(-64) systems use the emergecy restart routine in the course of which the code unconditionally writes to the physical address of 0x472 to indicate the boot mode to the firmware (BIOS or UEFI). When the kernel itself runs as a firmware in the VTL mode, that write corrupts the memory of the guest upon emergency restarting. Preserving the state intact in that situation is important for debugging, at least. Define the specialized machine callback to avoid that write and use the triple fault to perform emergency restart. Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Link: https://lore.kernel.org/r/20250227214728.15672-2-romank@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250227214728.15672-2-romank@linux.microsoft.com>
2025-03-20hyperv: Remove unused union and structsThorsten Blum
The union vmpacket_largest_possible_header and several structs have not been used for a long time afaict - remove them. Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://lore.kernel.org/r/20250311091634.494888-2-thorsten.blum@linux.dev Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250311091634.494888-2-thorsten.blum@linux.dev>
2025-03-20hyperv: Add CONFIG_MSHV_ROOT to gate root partition supportNuno Das Neves
CONFIG_MSHV_ROOT allows kernels built to run as a normal Hyper-V guest to exclude the root partition code, which is expected to grow significantly over time. This option is a tristate so future driver code can be built as a (m)odule, allowing faster development iteration cycles. If CONFIG_MSHV_ROOT is disabled, don't compile hv_proc.c, and stub hv_root_partition() to return false unconditionally. This allows the compiler to optimize away root partition code blocks since they will be disabled at compile time. In the case of booting as root partition *without* CONFIG_MSHV_ROOT enabled, print a critical error (the kernel will likely crash). Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1740167795-13296-4-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1740167795-13296-4-git-send-email-nunodasneves@linux.microsoft.com>
2025-03-20Merge tag 'vfs-6.14-final.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "A final set of fixes for this cycle: VFS: - Ensure that the stable offset api doesn't return duplicate directory entries when userspace has to perform the getdents call multiple times on large directories afs: - Prevent invalid pointer dereference during get_link RCU pathwalk fuse: - Fix deadlock caused by uninitialized rings when using io_uring with fuse - Handle race condition when using io_uring with fuse to prevent NULL dereference libnetfs: - Ensure that invalidate_cache is only called if implemented - Fix collection of results during pause when collection is offloaded - Ensure rolling_buffer_load_from_ra() doesn't clear mark bits - Make netfs_unbuffered_read() return ssize_t rather than int" * tag 'vfs-6.14-final.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: libfs: Fix duplicate directory entry in offset_dir_lookup fuse: fix possible deadlock if rings are never initialized netfs: Fix netfs_unbuffered_read() to return ssize_t rather than int netfs: Fix rolling_buffer_load_from_ra() to not clear mark bits netfs: Call `invalidate_cache` only if implemented netfs: Fix collection of results during pause when collection offloaded fuse: fix uring race condition for null dereference of fc afs: Fix afs_atcell_get_link() to check if ws_cell is unset first
2025-03-20Merge tag 'for-netdev' of ↵Paolo Abeni
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-03-13 The following pull-request contains BPF updates for your *net-next* tree. We've added 4 non-merge commits during the last 3 day(s) which contain a total of 2 files changed, 35 insertions(+), 12 deletions(-). The main changes are: 1) bpf_getsockopt support for TCP_BPF_RTO_MIN and TCP_BPF_DELACK_MAX, from Jason Xing bpf-next-for-netdev * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: selftests/bpf: Add bpf_getsockopt() for TCP_BPF_DELACK_MAX and TCP_BPF_RTO_MIN tcp: bpf: Support bpf_getsockopt for TCP_BPF_DELACK_MAX tcp: bpf: Support bpf_getsockopt for TCP_BPF_RTO_MIN tcp: bpf: Introduce bpf_sol_tcp_getsockopt to support TCP_BPF flags ==================== Link: https://patch.msgid.link/20250313221620.2512684-1-martin.lau@linux.dev Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni
Cross-merge networking fixes after downstream PR (net-6.14-rc8). Conflict: tools/testing/selftests/net/Makefile 03544faad761 ("selftest: net: add proc_net_pktgen") 3ed61b8938c6 ("selftests: net: test for lwtunnel dst ref loops") tools/testing/selftests/net/config: 85cb3711acb8 ("selftests: net: Add test cases for link and peer netns") 3ed61b8938c6 ("selftests: net: test for lwtunnel dst ref loops") Adjacent commits: tools/testing/selftests/net/Makefile c935af429ec2 ("selftests: net: add support for testing SO_RCVMARK and SO_RCVPRIORITY") 355d940f4d5a ("Revert "selftests: Add IPv6 link-local address generation tests for GRE devices."") Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20perf/x86/rapl: Fix error handling in init_rapl_pmus()Dhananjay Ugwekar
If init_rapl_pmu() fails while allocating memory for "rapl_pmu" objects, we miss freeing the "rapl_pmus" object in the error path. Fix that. Fixes: 9b99d65c0bb4 ("perf/x86/rapl: Move the pmu allocation out of CPU hotplug") Signed-off-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250320100617.4480-1-dhananjay.ugwekar@amd.com
2025-03-20ext4: don't over-report free space or inodes in statvfsTheodore Ts'o
This fixes an analogus bug that was fixed in xfs in commit 4b8d867ca6e2 ("xfs: don't over-report free space or inodes in statvfs") where statfs can report misleading / incorrect information where project quota is enabled, and the free space is less than the remaining quota. This commit will resolve a test failure in generic/762 which tests for this bug. Cc: stable@kernel.org Fixes: 689c958cbe6b ("ext4: add project quota support") Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
2025-03-20ASoC: SOF: mediatek: Commonize duplicated functionsAngeloGioacchino Del Regno
In order to reduce duplication, move the ADSP mailbox callbacks handle_reply(), handle_request(), and other common SOF callbacks send_msg(), get_bar_index(), pcm_hw_params() and pcm_pointer() to the mtk-adsp-common.c file. This cleanup brings no functional differences. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patch.msgid.link/20250320115300.137410-1-angelogioacchino.delregno@collabora.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20ASoC: dmic: Fix NULL pointer dereferenceMario Limonciello
Regulator support was introduced in commit d3321a20b5111 ("ASoC: dmic: add regulator support"). During probe `dmic->vref` is initialized with devm_regulator_get_optional() but in the error flow doesn't get cleared in the case that PTR_ERR(dmic->vref) is -ENODEV. This leads to the following NULL pointer deref. ``` Oops: Oops: 0000 [#1] SMP NOPTI CPU: 7 UID: 1000 PID: 1587 Comm: wireplumber Not tainted 6.14.0-rc7-next-20250318 #1 PREEMPT(voluntary) RIP: 0010:regulator_enable+0x17/0x70 RSP: 0018:ffffcc10c1fe7a38 EFLAGS: 00010282 RAX: ffff8bccc1c25010 RBX: ffffffffffffffed RCX: 0000000000000000 RDX: 0000000000000002 RSI: ffffcc10c1fe7a38 RDI: ffffffffffffffed RBP: ffffcc10c1fe7a68 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8bcccd51f800 R13: ffffffffc1086e88 R14: 0000000000000001 R15: 0000000000000001 FS: 00007f927bc35800(0000) GS:ffff8bd44f09f000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000065 CR3: 00000001332c6000 CR4: 0000000000750ef0 PKRU: 55555554 Call Trace: <TASK> ? show_regs+0x6c/0x80 ? __die+0x24/0x80 ? page_fault_oops+0x154/0x570 ? hrtimer_start_range_ns+0x142/0x4e0 ? timerqueue_del+0x31/0x50 ? do_user_addr_fault+0x4ac/0x880 ? exc_page_fault+0x82/0x1d0 ? asm_exc_page_fault+0x27/0x30 ? regulator_enable+0x17/0x70 ? __schedule+0x491/0x16b0 dmic_aif_event+0x82/0xa0 [snd_soc_dmic] ``` Adjust the error flow to explicitly set it back to NULL to avoid calling regulator_enable() with garbage data. Reported-by: Akshata V Unkal <Akshata.VUnkal@amd.com> Fixes: d3321a20b5111 ("ASoC: dmic: add regulator support") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patch.msgid.link/20250319145636.2401680-1-superm1@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2025-03-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fix from Paolo Bonzini: "A lone fix for a s390 regression. An earlier 6.14 commit stopped taking the pte lock for pages that are being converted to secure, but it was needed to avoid races. The patch was in development for a while and is finally ready, but I wish it was split into 3-4 commits at least" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: s390: pv: fix race when making a page secure
2025-03-20io_uring/net: don't clear REQ_F_NEED_CLEANUP unconditionallyJens Axboe
io_req_msg_cleanup() relies on the fact that io_netmsg_recycle() will always fully recycle, but that may not be the case if the msg cache was already full. To ensure that normal cleanup always gets run, let io_netmsg_recycle() deal with clearing the relevant cleanup flags, as it knows exactly when that should be done. Cc: stable@vger.kernel.org Reported-by: David Wei <dw@davidwei.uk> Fixes: 75191341785e ("io_uring/net: add iovec recycling") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-03-20Merge branch 'kvm-pre-tdx' into HEADPaolo Bonzini
- Add common secure TSC infrastructure for use within SNP and in the future TDX - Block KVM_CAP_SYNC_REGS if guest state is protected. It does not make sense to use the capability if the relevant registers are not available for reading or writing.
2025-03-20Merge branch 'kvm-nvmx-and-vm-teardown' into HEADPaolo Bonzini
The immediate issue being fixed here is a nVMX bug where KVM fails to detect that, after nested VM-Exit, L1 has a pending IRQ (or NMI). However, checking for a pending interrupt accesses the legacy PIC, and x86's kvm_arch_destroy_vm() currently frees the PIC before destroying vCPUs, i.e. checking for IRQs during the forced nested VM-Exit results in a NULL pointer deref; that's a prerequisite for the nVMX fix. The remaining patches attempt to bring a bit of sanity to x86's VM teardown code, which has accumulated a lot of cruft over the years. E.g. KVM currently unloads each vCPU's MMUs in a separate operation from destroying vCPUs, all because when guest SMP support was added, KVM had a kludgy MMU teardown flow that broke when a VM had more than one 1 vCPU. And that oddity lived on, for 18 years... Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-03-20drm/xe: Fix exporting xe buffers multiple timesTomasz Rusinowicz
The `struct ttm_resource->placement` contains TTM_PL_FLAG_* flags, but it was incorrectly tested for XE_PL_* flags. This caused xe_dma_buf_pin() to always fail when invoked for the second time. Fix this by checking the `mem_type` field instead. Fixes: 7764222d54b7 ("drm/xe: Disallow pinning dma-bufs in VRAM") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: intel-xe@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.8+ Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250218100353.2137964-1-jacek.lawrynowicz@linux.intel.com Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> (cherry picked from commit b96dabdba9b95f71ded50a1c094ee244408b2a8e) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-03-20Merge tag 'kvmarm-6.15' of ↵Paolo Bonzini
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.15 - Nested virtualization support for VGICv3, giving the nested hypervisor control of the VGIC hardware when running an L2 VM - Removal of 'late' nested virtualization feature register masking, making the supported feature set directly visible to userspace - Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers - Paravirtual interface for discovering the set of CPU implementations where a VM may run, addressing a longstanding issue of guest CPU errata awareness in big-little systems and cross-implementation VM migration - Userspace control of the registers responsible for identifying a particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1), allowing VMs to be migrated cross-implementation - pKVM updates, including support for tracking stage-2 page table allocations in the protected hypervisor in the 'SecPageTable' stat - Fixes to vPMU, ensuring that userspace updates to the vPMU after KVM_RUN are reflected into the backing perf events