summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-04-01nvme-multipath: change the NVME_MULTIPATH config optionJohn Meneghini
Fix up the NVME_MULTIPATH config description so that it accurately describes what it does. Signed-off-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-04-01nvme: update the multipath warning in nvme_init_ns_headJohn Meneghini
The new NVME_MULTIPATH_PARAM config option requires updates to the warning message in nvme_init_ns_head(). Signed-off-by: John Meneghini <jmeneghi@redhat.com> Tested-by: John Meneghini <jmeneghi@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-04-01nvme/ioctl: move fixed buffer lookup to nvme_uring_cmd_io()Caleb Sander Mateos
nvme_map_user_request() is called from both nvme_submit_user_cmd() and nvme_uring_cmd_io(). But the ioucmd branch is only applicable to nvme_uring_cmd_io(). Move it to nvme_uring_cmd_io() and just pass the resulting iov_iter to nvme_map_user_request(). For NVMe passthru operations with fixed buffers, the fixed buffer lookup happens in io_uring_cmd_import_fixed(). But nvme_uring_cmd_io() can return -EAGAIN first from nvme_alloc_user_request() if all tags in the tag set are in use. This ordering difference is observable when using UBLK_U_IO_{,UN}REGISTER_IO_BUF SQEs to modify the fixed buffer table. If the NVMe passthru operation is followed by UBLK_U_IO_UNREGISTER_IO_BUF to unregister the fixed buffer and the NVMe passthru goes async, the fixed buffer lookup will fail because it happens after the unregister. Userspace should not depend on the order in which io_uring issues SQEs submitted in parallel, but it may try submitting the SQEs together and fall back on a slow path if the fixed buffer lookup fails. To make the fast path more likely, do the import before nvme_alloc_user_request(). Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-04-01nvme/ioctl: move blk_mq_free_request() out of nvme_map_user_request()Caleb Sander Mateos
The callers of nvme_map_user_request() (nvme_submit_user_cmd() and nvme_uring_cmd_io()) allocate the request, so have them free it if nvme_map_user_request() fails. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-04-01nvme/ioctl: don't warn on vectorized uring_cmd with fixed bufferCaleb Sander Mateos
The vectorized io_uring NVMe passthru opcodes don't yet support fixed buffers. But since userspace can trigger this condition based on the io_uring SQE parameters, it shouldn't cause a kernel warning. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Fixes: 23fd22e55b76 ("nvme: wire up fixed buffer support for nvme passthrough") Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-04-01nvmet: pci-epf: Keep completion queues mappedDamien Le Moal
Instead of mapping and unmapping the completion queues memory to the host PCI address space whenever nvmet_pci_epf_cq_work() is called, map a completion queue to the host PCI address space when the completion queue is created with nvmet_pci_epf_create_cq() and unmap it when the completion queue is deleted with nvmet_pci_epf_delete_cq(). This removes the completion queue mapping/unmapping from nvmet_pci_epf_cq_work() and significantly increases performance. For a single job 4K random read QD=1 workload, the IOPS is increased from 23 KIOPS to 25 KIOPS. Some significant throughput increasde for high queue depth and large IOs workloads can also be seen. Since the functions nvmet_pci_epf_map_queue() and nvmet_pci_epf_unmap_queue() are called respectively only from nvmet_pci_epf_create_cq() and nvmet_pci_epf_delete_cq(), these functions are removed and open-coded in their respective call sites. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-04-01s390/vfio-ap: Fix no AP queue sharing allowed message written to kernel logAnthony Krowiak
An erroneous message is written to the kernel log when either of the following actions are taken by a user: 1. Assign an adapter or domain to a vfio_ap mediated device via its sysfs assign_adapter or assign_domain attributes that would result in one or more AP queues being assigned that are already assigned to a different mediated device. Sharing of queues between mdevs is not allowed. 2. Reserve an adapter or domain for the host device driver via the AP bus driver's sysfs apmask or aqmask attribute that would result in providing host access to an AP queue that is in use by a vfio_ap mediated device. Reserving a queue for a host driver that is in use by an mdev is not allowed. In both cases, the assignment will return an error; however, a message like the following is written to the kernel log: vfio_ap_mdev e1839397-51a0-4e3c-91e0-c3b9c3d3047d: Userspace may not re-assign queue 00.0028 already assigned to \ e1839397-51a0-4e3c-91e0-c3b9c3d3047d Notice the mdev reporting the error is the same as the mdev identified in the message as the one to which the queue is being assigned. It is perfectly okay to assign a queue to an mdev to which it is already assigned; the assignment is simply ignored by the vfio_ap device driver. This patch logs more descriptive and accurate messages for both 1 and 2 above to the kernel log: Example for 1: vfio_ap_mdev 0fe903a0-a323-44db-9daf-134c68627d61: Userspace may not assign queue 00.0033 to mdev: already assigned to \ 62177883-f1bb-47f0-914d-32a22e3a8804 Example for 2: vfio_ap_mdev 62177883-f1bb-47f0-914d-32a22e3a8804: Can not reserve queue 00.0033 for host driver: in use by mdev Signed-off-by: Anthony Krowiak <akrowiak@linux.ibm.com> Link: https://lore.kernel.org/r/20250311103304.1539188-1-akrowiak@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-04-01spi: cadence-qspi: revert "Improve spi memory performance"Miquel Raynal
During the v6.14-rc cycles, there has been an issue with syscons which prevented TI chipid controller to probe, itself preventing the only DMA engine on AM62A with the memcpy capability to probe, which is needed for the SPI controller to work in its most efficient configuration. The SPI controller on AM62A can be used in DAC and INDAC mode, which are some kind of direct mapping vs. CPU-controlled SPI operations, respectively. However, because of hardware constraints (some kind of request line not being driven), INDAC mode cannot leverage DMA without risking to underflow the SPI FIFO. This mode costs a lot of CPU cycles. On the other side however, DAC mode can be used with and without DMA support, but in practice, DMA transfers are way more efficient because of the burst capabilities that the CPU does not have. As a result, in terms of read throughput, using a Winbond chip in 1-8-8 SDR mode, we get: - 3.5MiB/s in DAC mode without DMA - 9.0MiB/s in INDAC mode (CPU more busy) - a fluctuating 9 to 12MiB/s in DAC mode with DMA (a constant 14.5MiB/s without CPUfreq) The reason for the patch that is being reverted is that because of the syscon issue, we were using a very un-efficient DAC configuration (no DMA), but since: commit 5728c92ae112 ("mfd: syscon: Restore device_node_to_regmap() for non-syscon nodes") the probing of the DMA controller has been fixed, and the performances are back to normal, so we can safely revert this commit. This is a revert of: commit cce2200dacd6 ("spi: cadence-qspi: Improve spi memory performance") Suggested-by: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://patch.msgid.link/20250401134748.242846-1-miquel.raynal@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-04-01rtc: remove 'setdate' test programWolfram Sang
The tool is not embedded in the testing framework. 'rtc' from rtc-tools serves as a much better programming example. No need to carry this tool in the kernel tree. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20250320103433.11673-1-wsa+renesas@sang-engineering.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2025-04-01block: remove unused nseg parameterNitesh Shetty
We are no longer using nr_segs, after blk_mq_attempt_bio_merge was moved out of blk_mq_get_new_request. Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com> Link: https://lore.kernel.org/r/20250401044348.15588-1-nj.shetty@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-01arm64: Don't call NULL in do_compat_alignment_fixup()Angelos Oikonomopoulos
do_alignment_t32_to_handler() only fixes up alignment faults for specific instructions; it returns NULL otherwise (e.g. LDREX). When that's the case, signal to the caller that it needs to proceed with the regular alignment fault handling (i.e. SIGBUS). Without this patch, the kernel panics: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Mem abort info: ESR = 0x0000000086000006 EC = 0x21: IABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x06: level 2 translation fault user pgtable: 4k pages, 48-bit VAs, pgdp=00000800164aa000 [0000000000000000] pgd=0800081fdbd22003, p4d=0800081fdbd22003, pud=08000815d51c6003, pmd=0000000000000000 Internal error: Oops: 0000000086000006 [#1] SMP Modules linked in: cfg80211 rfkill xt_nat xt_tcpudp xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo xt_addrtype nft_compat br_netfilter veth nvme_fa> libcrc32c crc32c_generic raid0 multipath linear dm_mod dax raid1 md_mod xhci_pci nvme xhci_hcd nvme_core t10_pi usbcore igb crc64_rocksoft crc64 crc_t10dif crct10dif_generic crct10dif_ce crct10dif_common usb_common i2c_algo_bit i2c> CPU: 2 PID: 3932954 Comm: WPEWebProcess Not tainted 6.1.0-31-arm64 #1 Debian 6.1.128-1 Hardware name: GIGABYTE MP32-AR1-00/MP32-AR1-00, BIOS F18v (SCP: 1.08.20211002) 12/01/2021 pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : 0x0 lr : do_compat_alignment_fixup+0xd8/0x3dc sp : ffff80000f973dd0 x29: ffff80000f973dd0 x28: ffff081b42526180 x27: 0000000000000000 x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 x23: 0000000000000004 x22: 0000000000000000 x21: 0000000000000001 x20: 00000000e8551f00 x19: ffff80000f973eb0 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : ffffaebc949bc488 x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000 x5 : 0000000000400000 x4 : 0000fffffffffffe x3 : 0000000000000000 x2 : ffff80000f973eb0 x1 : 00000000e8551f00 x0 : 0000000000000001 Call trace: 0x0 do_alignment_fault+0x40/0x50 do_mem_abort+0x4c/0xa0 el0_da+0x48/0xf0 el0t_32_sync_handler+0x110/0x140 el0t_32_sync+0x190/0x194 Code: bad PC value ---[ end trace 0000000000000000 ]--- Signed-off-by: Angelos Oikonomopoulos <angelos@igalia.com> Fixes: 3fc24ef32d3b ("arm64: compat: Implement misalignment fixups for multiword loads") Cc: <stable@vger.kernel.org> # 6.1.x Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20250401085150.148313-1-angelos@igalia.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-04-01selftest: rtc: skip some tests if the alarm only supports minutesWolfram Sang
There are alarms which have only minute-granularity. The RTC core already has a flag to describe them. Use this flag to skip tests which require the alarm to support seconds. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20250218101548.6514-1-wsa+renesas@sang-engineering.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2025-04-01MAINTAINERS: consistently use my dedicated email addressThomas Weißschuh
I use a dedicated address for kernel development. Unfortunately at some point I used another address and later copied it around to other places. Consistently use the dedicated address everywhere. As the old address does in fact work, an update to mailmap is not necessary. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20250331-email-correction-v1-1-4c0e92862202@weissschuh.net Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-04-01platform/x86: ISST: Correct command storage data lengthSrinivas Pandruvada
After resume/online turbo limit ratio (TRL) is restored partially if the admin explicitly changed TRL from user space. A hash table is used to store SST mail box and MSR settings when modified to restore those settings after resume or online. This uses a struct isst_cmd field "data" to store these settings. This is a 64 bit field. But isst_store_new_cmd() is only assigning as u32. This results in truncation of 32 bits. Change the argument to u64 from u32. Fixes: f607874f35cb ("platform/x86: ISST: Restore state on resume") Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250328224749.2691272-1-srinivas.pandruvada@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-04-01platform/x86: thinkpad_acpi: disable ACPI fan access for T495* and E560Eduard Christian Dumitrescu
T495, T495s, and E560 laptops have the FANG+FANW ACPI methods (therefore fang_handle and fanw_handle are not NULL) but they do not actually work, which results in a "No such device or address" error. The DSDT table code for the FANG+FANW methods doesn't seem to do anything special regarding the fan being secondary. The bug was introduced in commit 57d0557dfa49 ("platform/x86: thinkpad_acpi: Add Thinkpad Edge E531 fan support"), which added a new fan control method via the FANG+FANW ACPI methods. Add a quirk for T495, T495s, and E560 to avoid the FANG+FANW methods. Fan access and control is restored after forcing the legacy non-ACPI fan control method by setting both fang_handle and fanw_handle to NULL. Reported-by: Vlastimil Holer <vlastimil.holer@gmail.com> Fixes: 57d0557dfa49 ("platform/x86: thinkpad_acpi: Add Thinkpad Edge E531 fan support") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219643 Cc: stable@vger.kernel.org Tested-by: Alireza Elikahi <scr0lll0ck1s4b0v3h0m3k3y@gmail.com> Reviewed-by: Kurt Borja <kuurtb@gmail.com> Signed-off-by: Eduard Christian Dumitrescu <eduard.c.dumitrescu@gmail.com> Co-developed-by: Seyediman Seyedarab <ImanDevel@gmail.com> Signed-off-by: Seyediman Seyedarab <ImanDevel@gmail.com> Link: https://lore.kernel.org/r/20250324152442.106113-1-ImanDevel@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-04-01platform/x86: thinkpad_acpi: Fix NULL pointer dereferences while probingKurt Borja
Some subdrivers make use of the global reference tpacpi_pdev during initialization, which is called from the platform driver's probe. However, after the commit 38b9ab80db31 ("platform/x86: thinkpad_acpi: Move subdriver initialization to tpacpi_pdriver's probe.") this variable is only properly initialized *after* probing and this can result in a NULL pointer dereference. In order to fix this without reverting the commit, register the platform bundle in two steps, first create and initialize tpacpi_pdev, then register the driver synchronously with platform_driver_probe(). This way the benefits of commit 38b9ab80db31 are preserved. Additionally, the commit 43fc63a1e8f6 ("platform/x86: thinkpad_acpi: Move HWMON initialization to tpacpi_hwmon_pdriver's probe") introduced a similar problem, however tpacpi_sensors_pdev is only used once inside the probe, so replace the global reference with the one given by the probe. Reported-by: Damian Tometzki <damian@riscv-rocks.de> Closes: https://lore.kernel.org/r/CAL=B37kdL1orSQZD2A3skDOevRXBzF__cJJgY_GFh9LZO3FMsw@mail.gmail.com/ Fixes: 38b9ab80db31 ("platform/x86: thinkpad_acpi: Move subdriver initialization to tpacpi_pdriver's probe.") Fixes: 43fc63a1e8f6 ("platform/x86: thinkpad_acpi: Move HWMON initialization to tpacpi_hwmon_pdriver's probe") Tested-by: Damian Tometzki <damian@riscv-rocks.de> Tested-by: Gene C <arch@sapience.com> Signed-off-by: Kurt Borja <kuurtb@gmail.com> Link: https://lore.kernel.org/r/20250330-thinkpad-fix-v1-1-4906b3fe6b74@gmail.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2025-04-01rtc: mt6397: drop unused definesAlexandre Belloni
RTC_NUM_YEARS has never been used, the other defines are not used since commit 34bbdc12d04e ("rtc: mt6359: Add RTC hardware range and add support for start-year") Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20250303223637.1135362-1-alexandre.belloni@bootlin.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2025-04-01rtc: pcf85063: replace dev_err+return with return dev_err_probeMaud Spierings
Replace the dev_err plus return combo with return dev_err_probe() this actually communicates the error type when it occurs and helps debugging hardware issues. Signed-off-by: Maud Spierings <maudspierings@gocontroll.com> Link: https://lore.kernel.org/r/20250304-rtc_dev_err_probe-v1-1-9dcc042ad17e@gocontroll.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2025-04-01cifs: Do not add FILE_READ_ATTRIBUTES when using GENERIC_READ/EXECUTE/ALLPali Rohár
Individual bits GENERIC_READ, GENERIC_EXECUTE and GENERIC_ALL have meaning which includes also access right for FILE_READ_ATTRIBUTES. So specifying FILE_READ_ATTRIBUTES bit together with one of those GENERIC (except GENERIC_WRITE) does not do anything. This change prevents calling additional (fallback) code and sending more requests without FILE_READ_ATTRIBUTES when the primary request fails on -EACCES, as it is not needed at all. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-01cifs: Improve SMB2+ stat() to work also without FILE_READ_ATTRIBUTESPali Rohár
If SMB2_OP_QUERY_INFO (called when POSIX extensions are not used) failed with STATUS_ACCESS_DENIED then it means that caller does not have permission to open the path with FILE_READ_ATTRIBUTES access and therefore cannot issue SMB2_OP_QUERY_INFO command. This will result in the -EACCES error from stat() sycall. There is an alternative way how to query limited information about path but still suitable for stat() syscall. SMB2 OPEN/CREATE operation returns in its successful response subset of query information. So try to open the path without FILE_READ_ATTRIBUTES but with MAXIMUM_ALLOWED access which will grant the maximum possible access to the file and the response will contain required query information for stat() syscall. This will improve smb2_query_path_info() to query also files which do not grant FILE_READ_ATTRIBUTES access to caller. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-01cifs: Add fallback for SMB2 CREATE without FILE_READ_ATTRIBUTESPali Rohár
Some operations, like WRITE, does not require FILE_READ_ATTRIBUTES access. So when FILE_READ_ATTRIBUTES is not explicitly requested for smb2_open_file() then first try to do SMB2 CREATE with FILE_READ_ATTRIBUTES access (like it was before) and then fallback to SMB2 CREATE without FILE_READ_ATTRIBUTES access (less common case). This change allows to complete WRITE operation to a file when it does not grant FILE_READ_ATTRIBUTES permission and its parent directory does not grant READ_DATA permission (parent directory READ_DATA is implicit grant of child FILE_READ_ATTRIBUTES permission). Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-01rtc: pcf85063: do a SW reset if POR failedLukas Stockmann
Power-on Reset has a documented issue in PCF85063, refer to its datasheet, section "Software reset": "There is a low probability that some devices will have corruption of the registers after the automatic power-on reset if the device is powered up with a residual VDD level. It is required that the VDD starts at zero volts at power up or upon power cycling to ensure that there is no corruption of the registers. If this is not possible, a reset must be initiated after power-up (i.e. when power is stable) with the software reset command" Trigger SW reset if there is an indication that POR has failed. Link: https://www.nxp.com/docs/en/data-sheet/PCF85063A.pdf Signed-off-by: Lukas Stockmann <lukas.stockmann@siemens.com> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com> Link: https://lore.kernel.org/r/20250120093451.30778-1-alexander.sverdlin@siemens.com Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
2025-04-01x86/mm/init: Handle the special case of device private pages in add_pages(), ↵Balbir Singh
to not increase max_pfn and trigger dma_addressing_limited() bounce buffers As Bert Karwatzki reported, the following recent commit causes a performance regression on AMD iGPU and dGPU systems: 7ffb791423c7 ("x86/kaslr: Reduce KASLR entropy on most x86 systems") It exposed a bug with nokaslr and zone device interaction. The root cause of the bug is that, the GPU driver registers a zone device private memory region. When KASLR is disabled or the above commit is applied, the direct_map_physmem_end is set to much higher than 10 TiB typically to the 64TiB address. When zone device private memory is added to the system via add_pages(), it bumps up the max_pfn to the same value. This causes dma_addressing_limited() to return true, since the device cannot address memory all the way up to max_pfn. This caused a regression for games played on the iGPU, as it resulted in the DMA32 zone being used for GPU allocations. Fix this by not bumping up max_pfn on x86 systems, when pgmap is passed into add_pages(). The presence of pgmap is used to determine if device private memory is being added via add_pages(). More details: devm_request_mem_region() and request_free_mem_region() request for device private memory. iomem_resource is passed as the base resource with start and end parameters. iomem_resource's end depends on several factors, including the platform and virtualization. On x86 for example on bare metal, this value is set to boot_cpu_data.x86_phys_bits. boot_cpu_data.x86_phys_bits can change depending on support for MKTME. By default it is set to the same as log2(direct_map_physmem_end) which is 46 to 52 bits depending on the number of levels in the page table. The allocation routines used iomem_resource's end and direct_map_physmem_end to figure out where to allocate the region. [ arch/powerpc is also impacted by this problem, but this patch does not fix the issue for PowerPC. ] Testing: 1. Tested on a virtual machine with test_hmm for zone device inseration 2. A previous version of this patch was tested by Bert, please see: https://lore.kernel.org/lkml/d87680bab997fdc9fb4e638983132af235d9a03a.camel@web.de/ [ mingo: Clarified the comments and the changelog. ] Reported-by: Bert Karwatzki <spasswolf@web.de> Tested-by: Bert Karwatzki <spasswolf@web.de> Fixes: 7ffb791423c7 ("x86/kaslr: Reduce KASLR entropy on most x86 systems") Signed-off-by: Balbir Singh <balbirs@nvidia.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Link: https://lore.kernel.org/r/20250401000752.249348-1-balbirs@nvidia.com
2025-04-01objtool/loongarch: Add unwind hints in prepare_frametrace()Josh Poimboeuf
If 'regs' points to a local stack variable, prepare_frametrace() stores all registers to the stack. This confuses objtool as it expects them to be restored from the stack later. The stores don't affect stack tracing, so use unwind hints to hide them from objtool. Fixes the following warnings: arch/loongarch/kernel/traps.o: warning: objtool: show_stack+0xe0: stack state mismatch: reg1[22]=-1+0 reg2[22]=-2-160 arch/loongarch/kernel/traps.o: warning: objtool: show_stack+0xe0: stack state mismatch: reg1[23]=-1+0 reg2[23]=-2-152 Fixes: cb8a2ef0848c ("LoongArch: Add ORC stack unwinder support") Reported-by: kernel test robot <lkp@intel.com> Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/270cadd8040dda74db2307f23497bb68e65db98d.1743481539.git.jpoimboe@kernel.org Closes: https://lore.kernel.org/oe-kbuild-all/202503280703.OARM8SrY-lkp@intel.com/
2025-04-01rcu-tasks: Always inline rcu_irq_work_resched()Josh Poimboeuf
Thanks to CONFIG_DEBUG_SECTION_MISMATCH, empty functions can be generated out of line. rcu_irq_work_resched() can be called from noinstr code, so make sure it's always inlined. Fixes: 564506495ca9 ("rcu/context-tracking: Move deferred nocb resched to context tracking") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/e84f15f013c07e4c410d972e75620c53b62c1b3e.1743481539.git.jpoimboe@kernel.org Closes: https://lore.kernel.org/d1eca076-fdde-484a-b33e-70e0d167c36d@infradead.org
2025-04-01context_tracking: Always inline ct_{nmi,irq}_{enter,exit}()Josh Poimboeuf
Thanks to CONFIG_DEBUG_SECTION_MISMATCH, empty functions can be generated out of line. These can be called from noinstr code, so make sure they're always inlined. Fixes the following warnings: vmlinux.o: warning: objtool: irqentry_nmi_enter+0xa2: call to ct_nmi_enter() leaves .noinstr.text section vmlinux.o: warning: objtool: irqentry_nmi_exit+0x16: call to ct_nmi_exit() leaves .noinstr.text section vmlinux.o: warning: objtool: irqentry_exit+0x78: call to ct_irq_exit() leaves .noinstr.text section Fixes: 6f0e6c1598b1 ("context_tracking: Take IRQ eqs entrypoints over RCU") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/8509bce3f536bcd4ae7af3a2cf6930d48c5e631a.1743481539.git.jpoimboe@kernel.org Closes: https://lore.kernel.org/d1eca076-fdde-484a-b33e-70e0d167c36d@infradead.org
2025-04-01riscv: Add norvc after .option arch in runtime constCharlie Jenkins
.option arch clobbers .option norvc. Prevent gas from emitting compressed instructions in the runtime const alternative blocks by setting .option norvc after .option arch. This issue starts appearing on gcc 15, which adds zca to the march. Reported by: Klara Modin <klarasmodin@gmail.com> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Fixes: a44fb5722199 ("riscv: Add runtime constant support") Closes: https://lore.kernel.org/all/cc8f3525-20b7-445b-877b-2add28a160a2@gmail.com/ Tested-by: Klara Modin <klarasmodin@gmail.com> Link: https://lore.kernel.org/r/20250331-fix_runtime_const_norvc-v1-1-89bc62687ab8@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01sched/smt: Always inline sched_smt_active()Josh Poimboeuf
sched_smt_active() can be called from noinstr code, so it should always be inlined. The CONFIG_SCHED_SMT version already has __always_inline. Do the same for its !CONFIG_SCHED_SMT counterpart. Fixes the following warning: vmlinux.o: error: objtool: intel_idle_ibrs+0x13: call to sched_smt_active() leaves .noinstr.text section Fixes: 321a874a7ef8 ("sched/smt: Expose sched_smt_present static key") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/1d03907b0a247cf7fb5c1d518de378864f603060.1743481539.git.jpoimboe@kernel.org Closes: https://lore.kernel.org/r/202503311434.lyw2Tveh-lkp@intel.com/
2025-04-01riscv: Make sure toolchain supports zba before using zba instructionsAlexandre Ghiti
Old toolchain like gcc 8.5.0 does not support zba, so we must check that the toolchain supports this extension before using it in the kernel. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202503281836.8pntHm6I-lkp@intel.com/ Link: https://lore.kernel.org/r/20250328115422.253670-1-alexghiti@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01objtool: Fix verbose disassembly if CROSS_COMPILE isn't setDavid Laight
In verbose mode, when printing the disassembly of affected functions, if CROSS_COMPILE isn't set, the objdump command string gets prefixed with "(null)". Somehow this worked before. Maybe some versions of glibc return an empty string instead of NULL. Fix it regardless. [ jpoimboe: Rewrite commit log. ] Fixes: ca653464dd097 ("objtool: Add verbose option for disassembling affected functions") Signed-off-by: David Laight <david.laight.linux@gmail.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250215142321.14081-1-david.laight.linux@gmail.com Link: https://lore.kernel.org/r/b931a4786bc0127aa4c94e8b35ed617dcbd3d3da.1743481539.git.jpoimboe@kernel.org
2025-04-01objtool: Change "warning:" to "error: " for fatal errorsJosh Poimboeuf
This is similar to GCC's behavior and makes it more obvious why the build failed. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/0ea76f4b0e7a370711ed9f75fd0792bb5979c2bf.1743481539.git.jpoimboe@kernel.org
2025-04-01objtool: Always fail on fatal errorsJosh Poimboeuf
Objtool writes several object annotations which are used to enable critical kernel runtime functionalities like static calls and retpoline/rethunk patching. In the rare case where it fails to read or write an object, the annotations don't get written, causing runtime code patching to fail and code to become corrupted. Due to the catastrophic nature of such warnings, convert them to errors which fail the build regardless of CONFIG_OBJTOOL_WERROR. Reported-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/7d35684ca61eac56eb2424f300ca43c5d257b170.1743481539.git.jpoimboe@kernel.org Closes: https://lore.kernel.org/SJ1PR11MB61295789E25C2F5197EFF2F6B9A72@SJ1PR11MB6129.namprd11.prod.outlook.com
2025-04-01Revert "objtool: Increase per-function WARN_FUNC() rate limit"Josh Poimboeuf
This reverts commit 0a7fb6f07e3ad497d31ae9a2082d2cacab43d54a. The "skipping duplicate warnings" warning is technically not an actual warning, which can cause confusion. This feature isn't all that useful anyway. It's exceedingly rare for a function to have more than one unrelated warning. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/e5abe5e858acf1a9207a5dfa0f37d17ac9dca872.1743481539.git.jpoimboe@kernel.org
2025-04-01riscv/purgatory: 4B align purgatory_startBjörn Töpel
When a crashkernel is launched on RISC-V, the entry to purgatory is done by trapping via the stvec CSR. From riscv_kexec_norelocate(): | ... | /* | * Switch to physical addressing | * This will also trigger a jump to CSR_STVEC | * which in this case is the address of the new | * kernel. | */ | csrw CSR_STVEC, a2 | csrw CSR_SATP, zero stvec requires that the address is 4B aligned, which was not the case, e.g.: | Loaded purgatory at 0xffffc000 | kexec_file: kexec_file_load: type:1, start:0xffffd232 head:0x4 flags:0x6 The address 0xffffd232 not 4B aligned. Correct by adding proper function alignment. With this change, crashkernels loaded with kexec-file will be able to properly enter the purgatory. Fixes: 736e30af583fb ("RISC-V: Add purgatory") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250328085313.1193815-1-bjorn@kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01objtool: Append "()" to function name in "unexpected end of section" warningJosh Poimboeuf
Append with "()" to clarify it's a function. Before: vmlinux.o: warning: objtool: cdns_mrvl_xspi_setup_clock: unexpected end of section .text.cdns_mrvl_xspi_setup_clock After: vmlinux.o: warning: objtool: cdns_mrvl_xspi_setup_clock(): unexpected end of section .text.cdns_mrvl_xspi_setup_clock Fixes: c5995abe1547 ("objtool: Improve error handling") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/692e1e0d0b15a71bd35c6b4b87f3c75cd5a57358.1743481539.git.jpoimboe@kernel.org
2025-04-01riscv/kexec_file: Handle R_RISCV_64 in purgatory relocatorYao Zi
Commit 58ff537109ac ("riscv: Omit optimized string routines when using KASAN") introduced calls to EXPORT_SYMBOL() in assembly string routines, which result in R_RISCV_64 relocations against .export_symbol section. As these rountines are reused by RISC-V purgatory and our relocator doesn't recognize these relocations, this fails kexec-file-load with dmesg like [ 11.344251] kexec_image: Unknown rela relocation: 2 [ 11.345972] kexec_image: Error loading purgatory ret=-8 Let's support R_RISCV_64 relocation to fix kexec on 64-bit RISC-V. 32-bit variant isn't covered since KEXEC_FILE and KEXEC_PURGATORY isn't available. Fixes: 58ff537109ac ("riscv: Omit optimized string routines when using KASAN") Signed-off-by: Yao Zi <ziyao@disroot.org> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250326051445.55131-2-ziyao@disroot.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01objtool: Ignore end-of-section jumps for KCOV/GCOVJosh Poimboeuf
When KCOV or GCOV is enabled, dead code can be left behind, in which case objtool silences unreachable and undefined behavior (fallthrough) warnings. Fallthrough warnings, and their variant "end of section" warnings, were silenced with the following commit: 6b023c784204 ("objtool: Silence more KCOV warnings") Another variant of a fallthrough warning is a jump to the end of a function. If that function happens to be at the end of a section, the jump destination doesn't actually exist. Normally that would be a fatal objtool error, but for KCOV/GCOV it's just another undefined behavior fallthrough. Silence it like the others. Fixes the following warning: drivers/iommu/dma-iommu.o: warning: objtool: iommu_dma_sw_msi+0x92: can't find jump dest instruction at .text+0x54d5 Fixes: 6b023c784204 ("objtool: Silence more KCOV warnings") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/08fbe7d7e1e20612206f1df253077b94f178d93e.1743481539.git.jpoimboe@kernel.org Closes: https://lore.kernel.org/314f8809-cd59-479b-97d7-49356bf1c8d1@infradead.org/
2025-04-01objtool: Silence more KCOV warnings, part 2Josh Poimboeuf
Similar to GCOV, KCOV can leave behind dead code and undefined behavior. Warnings related to those should be ignored. The previous commit: 6b023c784204 ("objtool: Silence more KCOV warnings") ... only did so for CONFIG_CGOV_KERNEL. Also do it for CONFIG_KCOV, but for real this time. Fixes the following warning: vmlinux.o: warning: objtool: synaptics_report_mt_data: unexpected end of section .text.synaptics_report_mt_data Fixes: 6b023c784204 ("objtool: Silence more KCOV warnings") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/a44ba16e194bcbc52c1cef3d3cd9051a62622723.1743481539.git.jpoimboe@kernel.org Closes: https://lore.kernel.org/oe-kbuild-all/202503282236.UhfRsF3B-lkp@intel.com/
2025-04-01Merge patch series "Add some validation for vector, vector crypto and fp stuff"Alexandre Ghiti
Conor Dooley <conor@kernel.org> says: From: Conor Dooley <conor.dooley@microchip.com> Yo, This series is partly leveraging Clement's work adding a validate callback in the extension detection code so that things like checking for whether a vector crypto extension is usable can be done like: has_extension(<vector crypto>) rather than has_vector() && has_extension(<vector crypto>) which Eric pointed out was a poor design some months ago. The rest of this is adding some requirements to the bindings that prevent combinations of extensions disallowed by the ISA. There's a bunch of over-long lines in here, but I thought that the over-long lines were clearer than breaking them up. Cheers, Conor. * patches from https://lore.kernel.org/r/20250312-abide-pancreas-3576b8c44d2c@spud: dt-bindings: riscv: document vector crypto requirements dt-bindings: riscv: add vector sub-extension dependencies dt-bindings: riscv: d requires f RISC-V: add f & d extension validation checks RISC-V: add vector crypto extension validation checks RISC-V: add vector extension validation checks Link: https://lore.kernel.org/r/20250312-abide-pancreas-3576b8c44d2c@spud Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01selftests: riscv: fix v_exec_initval_nolibc.cIgnacio Encinas
Vector registers are zero initialized by the kernel. Stop accepting "all ones" as a clean value. Note that this was not working as expected given that value == 0xff can be assumed to be always false by the compiler as value's range is [-128, 127]. Both GCC (-Wtype-limits) and clang (-Wtautological-constant-out-of-range-compare) warn about this. Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Tested-by: Charlie Jenkins <charlie@rivosinc.com> Signed-off-by: Ignacio Encinas <ignacio@iencinas.com> Link: https://lore.kernel.org/r/20250306-fix-v_exec_initval_nolibc-v2-1-97f9dc8a7faf@iencinas.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01riscv: Fix hugetlb retrieval of number of ptes in case of !present pteAlexandre Ghiti
Ryan sent a fix [1] for arm64 that applies to riscv too: in some hugetlb functions, we must not use the pte value to get the size of a mapping because the pte may not be present. So use the already present size parameter for huge_pte_clear() and the newly introduced size parameter for huge_ptep_get_and_clear(). And make sure to gather A/D bits only on present ptes. Fixes: 82a1a1f3bfb6 ("riscv: mm: support Svnapot in hugetlb page") Link: https://lore.kernel.org/all/20250217140419.1702389-1-ryan.roberts@arm.com/ [1] Link: https://lore.kernel.org/r/20250317072551.572169-1-alexghiti@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01riscv: print hartid on bringupYunhui Cui
Firmware randomly releases cores, so CPU numbers don't linearly map to hartids. When the system has an exception, we care more about hartids. Adding "dyndbg="file smpboot.c +p" loglevel=8" to the cmdline can output the hartid. Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250303083424.14309-1-cuiyunhui@bytedance.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-04-01cifs: Fix querying and creating MF symlinks over SMB1Pali Rohár
Old SMB1 servers without CAP_NT_SMBS do not support CIFS_open() function and instead SMBLegacyOpen() needs to be used. This logic is already handled in cifs_open_file() function, which is server->ops->open callback function. So for querying and creating MF symlinks use open callback function instead of CIFS_open() function directly. This change fixes querying and creating new MF symlinks on Windows 98. Currently cifs_query_mf_symlink() is not able to detect MF symlink and cifs_create_mf_symlink() is failing with EIO error. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-01cifs: Fix access_flags_to_smbopen_modePali Rohár
When converting access_flags to SMBOPEN mode, check for all possible access flags, not only GENERIC_READ and GENERIC_WRITE flags. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-01cifs: Fix negotiate retry functionalityPali Rohár
SMB negotiate retry functionality in cifs_negotiate() is currently broken and does not work when doing socket reconnect. Caller of this function, which is cifs_negotiate_protocol() requires that tcpStatus after successful execution of negotiate callback stay in CifsInNegotiate. But if the CIFSSMBNegotiate() called from cifs_negotiate() fails due to connection issues then tcpStatus is changed as so repeated CIFSSMBNegotiate() call does not help. Fix this problem by moving retrying code from negotiate callback (which is either cifs_negotiate() or smb2_negotiate()) to cifs_negotiate_protocol() which is caller of those callbacks. This allows to properly handle and implement correct transistions between tcpStatus states as function cifs_negotiate_protocol() already handles it. With this change, cifs_negotiate_protocol() now handles also -EAGAIN error set by the RFC1002_NEGATIVE_SESSION_RESPONSE processing after reconnecting with NetBIOS session. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-01cifs: Improve handling of NetBIOS packetsPali Rohár
Now all NetBIOS session logic is handled in ip_rfc1001_connect() function, so cleanup is_smb_response() function which contains generic handling of incoming SMB packets. Note that function is_smb_response() is not used directly or indirectly (e.g. over cifs_demultiplex_thread() by ip_rfc1001_connect() function. Except the Negative Session Response and the Session Keep Alive packet, the cifs_demultiplex_thread() should not receive any NetBIOS session packets. And Session Keep Alive packet may be received only when the NetBIOS session was established by ip_rfc1001_connect() function. So treat any such packet as error and schedule reconnect. Negative Session Response packet is returned from Windows SMB server (from Windows 98 and also from Windows Server 2022) if client sent over port 139 SMB negotiate request without previously establishing a NetBIOS session. The common scenario is that Negative Session Response packet is returned for the SMB negotiate packet, which is the first one which SMB client sends (if it is not establishing a NetBIOS session). Note that server port 139 may be forwarded and mapped between virtual machines to different number. And Linux SMB client do not call function ip_rfc1001_connect() when prot is not 139. So nowadays when using port mapping or port forwarding between VMs, it is not so uncommon to see this error. Currently the logic on Negative Session Response packet changes server port to 445 and force reconnection. But this logic does not work when using non-standard port numbers and also does not help if the server on specified port is requiring establishing a NetBIOS session. Fix this Negative Session Response logic and instead of changing server port (on which server does not have to listen), force reconnection with establishing a NetBIOS session. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-01cifs: Allow to disable or force initialization of NetBIOS sessionPali Rohár
Currently SMB client always tries to initialize NetBIOS session when the server port is 139. This is useful for default cases, but nowadays when using non-standard routing or testing between VMs, it is common that servers are listening on non-standard ports. So add a new mount option -o nbsessinit and -o nonbsessinit which either forces initialization or disables initialization regardless of server port number. This allows Linux SMB client to connect to older SMB1 server listening on non-standard port, which requires initialization of NetBIOS session, by using additional mount options -o port= and -o nbsessinit. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-01cifs: Add a new xattr system.smb3_ntsd_owner for getting or setting ownerPali Rohár
Changing owner is controlled by DACL permission WRITE_OWNER. Changing DACL itself is controlled by DACL permisssion WRITE_DAC. Owner of the file has implicit WRITE_DAC permission even when it is not explicitly granted for owner by DACL. Reading DACL or owner is controlled only by one permission READ_CONTROL. WRITE_OWNER permission can be bypassed by the SeTakeOwnershipPrivilege, which is by default available for local administrators. So if the local administrator wants to access some file to which does not have access, it is required to first change owner to ourself and then change DACL permissions. Currently Linux SMB client does not support this because client does not provide a way to change owner without touching DACL permissions. Fix this problem by introducing a new xattr "system.smb3_ntsd_owner" for setting/changing only owner part of the security descriptor. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-01cifs: Add a new xattr system.smb3_ntsd_sacl for getting or setting SACLsPali Rohár
Access to SACL part of SMB security descriptor is granted by SACL privilege which by default is accessible only for local administrator. But it can be granted to any other user by local GPO or AD. SACL access is not granted by DACL permissions and therefore is it possible that some user would not have access to DACLs of some file, but would have access to SACLs of all files. So it means that for accessing SACLs (either getting or setting) in some cases requires not touching or asking for DACLs. Currently Linux SMB client does not allow to get or set SACLs without touching DACLs. Which means that user without DACL access is not able to get or set SACLs even if it has access to SACLs. Fix this problem by introducing a new xattr "system.smb3_ntsd_sacl" for accessing only SACLs part of the security descriptor (therefore without DACLs and OWNER/GROUP). Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-04-01bcachefs: fix ref leak in btree_node_read_all_replicasKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>