Age | Commit message | Author
2023-08-11 | thermal: intel: intel_soc_dts_iosf: Use struct thermal_trip | Rafael J. Wysocki
Because the number of trip points in each thermal zone and their types are known to intel_soc_dts_iosf_init() prior to the registration of the thermal zones, make it create an array of struct thermal_trip entries in each struct intel_soc_dts_sensor_entry object and make add_dts_thermal_zone() use thermal_zone_device_register_with_trips() for thermal zone registration and pass that array as its second argument. Drop the sys_get_trip_temp() and sys_get_trip_type() callback functions along with the respective callback pointers in tzone_ops, because they are not necessary any more. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-08-11 | thermal: intel: intel_soc_dts_iosf: Rework critical trip setup | Rafael J. Wysocki
Critical trip points appear in the DTS thermal zones only after those thermal zones have been registered via intel_soc_dts_iosf_init(). Moreover, they are "created" by changing the type of an existing trip point from THERMAL_TRIP_PASSIVE to THERMAL_TRIP_CRITICAL via intel_soc_dts_iosf_add_read_only_critical_trip(), the caller of which has to be careful enough to pass at least 1 as the number of read-only trip points to intel_soc_dts_iosf_init() beforehand. This is questionable, because user space may have started to use the trips at the time when intel_soc_dts_iosf_add_read_only_critical_trip() runs and there is no synchronization between it and sys_set_trip_temp(). To address it, use the observation that a nonzero number of read-only trip points is only passed to intel_soc_dts_iosf_init() when critical trip points are going to be used, so in fact that function may get all of the information regarding the critical trip points upfront and it can configure them before registering the corresponding thermal zones. Accordingly, replace the read_only_trip_count argument of intel_soc_dts_iosf_init() with a pair of new arguments related to critical trip points: a bool one indicating whether or not critical trip points are to be used at all and an int one representing the critical trip point temperature offset relative to Tj_max. Use these arguments to configure the critical trip points before the registration of the thermal zones and to compute the number of writeable trip points in add_dts_thermal_zone(). Modify both callers of intel_soc_dts_iosf_init() to take these changes into account and drop the intel_soc_dts_iosf_add_read_only_critical_trip() call, which is no longer necessary, from intel_soc_thermal_init(), which also allows it to return success right after requesting the IRQ. Finally, drop intel_soc_dts_iosf_add_read_only_critical_trip() altogether, because it does not have any more users. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
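The idea of configuring critical trips before registration can be sketched in plain user-space C. This is only an illustration, not the driver code: `struct trip`, `setup_trips()` and `MAX_TRIPS` are hypothetical names, and it assumes that the offset is subtracted from Tj_max and that the last slot holds the critical trip, neither of which the message above states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRIPS 2  /* hypothetical: two trips per sensor */

enum trip_type { TRIP_PASSIVE, TRIP_CRITICAL };

struct trip {
    enum trip_type type;
    int temperature;  /* millidegrees Celsius */
};

/*
 * Fill in every trip before the zone would be registered and return the
 * number of trips that stay writable from user space.
 */
int setup_trips(struct trip trips[MAX_TRIPS], bool critical_trip,
                int tj_max, int crit_offset)
{
    int i;

    assert(trips != NULL);

    for (i = 0; i < MAX_TRIPS; i++) {
        trips[i].type = TRIP_PASSIVE;
        trips[i].temperature = 0;
    }

    if (!critical_trip)
        return MAX_TRIPS;

    /* Assumption: the last slot becomes the read-only critical trip. */
    trips[MAX_TRIPS - 1].type = TRIP_CRITICAL;
    trips[MAX_TRIPS - 1].temperature = tj_max - crit_offset;
    return MAX_TRIPS - 1;
}
```

Because the array is complete before anything is published, no later call has to change a trip type while user space may already be writing trip temperatures.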
2023-08-11 | thermal: intel: intel_soc_dts_iosf: Add helper for resetting trip points | Rafael J. Wysocki
Because trip points are reset for each sensor in two places in the same way, add a helper function for that to reduce code duplication a bit. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-08-11 | thermal: intel: intel_soc_dts_iosf: Change initialization ordering | Rafael J. Wysocki
The initial configuration of trip points in intel_soc_dts_iosf_init() takes place after registering the sensor thermal zones which is potentially problematic, because it may race with the setting of trip point temperatures via sysfs, as there is no synchronization between it and sys_set_trip_temp(). To address this, change the initialization ordering so that the trip points are configured prior to the registration of thermal zones. Accordingly, change the cleanup ordering in intel_soc_dts_iosf_exit() to remove the thermal zones before resetting the trip points. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
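The ordering rule described here (configure first, register second, and the reverse on teardown) can be modelled with a small user-space sketch. All names below are hypothetical and stand in for the driver's real functions; the two flags only track what has happened so far.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sensor {
    bool trips_configured;
    bool zone_registered;  /* once true, user space may write trips */
};

/* Refuse to touch the trips once the zone is visible to user space. */
int sensor_configure_trips(struct sensor *s)
{
    assert(s != NULL);
    if (s->zone_registered)
        return -1;  /* would race with sysfs writes */
    s->trips_configured = true;
    return 0;
}

/* Only publish a zone whose trips are already set up. */
int sensor_register_zone(struct sensor *s)
{
    assert(s != NULL);
    if (!s->trips_configured)
        return -1;
    s->zone_registered = true;
    return 0;
}

int sensor_init(struct sensor *s)
{
    if (sensor_configure_trips(s))
        return -1;
    return sensor_register_zone(s);
}

/* Teardown mirrors init: unregister first, then reset the trips. */
void sensor_exit(struct sensor *s)
{
    assert(s != NULL);
    s->zone_registered = false;
    s->trips_configured = false;
}
```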
2023-08-11 | thermal: intel: intel_soc_dts_iosf: Pass sensors to update_trip_temp() | Rafael J. Wysocki
After previous changes, update_trip_temp() only uses its dts argument to get to the sensors field in the struct intel_soc_dts_sensor_entry object pointed to by that argument, so pass the value of that field directly to it instead. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-08-11 | thermal: intel: intel_soc_dts_iosf: Untangle update_trip_temp() | Rafael J. Wysocki
Function update_trip_temp() is currently used for the initialization of trip points as well as for changing trip point temperatures in sys_set_trip_temp(). This is quite confusing and passing the value of dts->trip_types[trip] to it so that it can store that value in the same memory location is not particularly useful, because it only is necessary to set the trip point type once, at the initialization time. For this reason, drop the last argument from update_trip_temp() and introduce configure_trip() calling the former internally for the initial configuration of trip points. Modify the majority of update_trip_temp() callers to use configure_trip() instead of it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-08-11 | thermal: intel: intel_soc_dts_iosf: Always assume notification support | Rafael J. Wysocki
None of the existing callers of intel_soc_dts_iosf_init() passes INTEL_SOC_DTS_INTERRUPT_NONE as the first argument to it, so the notification local variable in it is always true and the notification_support argument of add_dts_thermal_zone() is always true as well. For this reason, drop the notification local variable from intel_soc_dts_iosf_init() and the notification_support argument from add_dts_thermal_zone() and rearrange the latter to always set writable_trip_cnt and trip_mask. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2023-08-11 | Merge tag 'pci-v6.5-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci | Linus Torvalds
Pull pci fixes from Bjorn Helgaas:

 - Add Manivannan Sadhasivam as DesignWare PCIe driver co-maintainer (Krzysztof Wilczyński)
 - Revert "PCI: dwc: Wait for link up only if link is started" to fix a regression on Qualcomm platforms that don't reach interconnect sync state if the slot is empty (Johan Hovold)
 - Revert "PCI: mvebu: Mark driver as BROKEN" so people can use pci-mvebu even though some others report problems (Bjorn Helgaas)
 - Avoid a NULL pointer dereference when using acpiphp for root bus hotplug to fix a regression added in v6.5-rc1 (Igor Mammedov)

* tag 'pci-v6.5-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus
  Revert "PCI: mvebu: Mark driver as BROKEN"
  Revert "PCI: dwc: Wait for link up only if link is started"
  MAINTAINERS: Add Manivannan Sadhasivam as DesignWare PCIe driver maintainer
2023-08-11 | Merge tag 'riscv-for-linus-6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux | Linus Torvalds
Pull RISC-V fixes from Palmer Dabbelt:

 - Fixes for a pair of kexec_file_load() failures
 - A fix to ensure the direct mapping is PMD-aligned
 - A fix for CPU feature detection on SMP=n
 - The MMIO ordering fences have been strengthened to ensure ordering WRT delay()
 - Fixes for a pair of -Wmissing-variable-declarations warnings
 - A fix to avoid PUD mappings in vmap on sv39
 - flush_cache_vmap() now flushes the TLB to avoid issues on systems that cache invalid mappings

* tag 'riscv-for-linus-6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Implement flush_cache_vmap()
  riscv: Do not allow vmap pud mappings for 3-level page table
  riscv: mm: fix 2 instances of -Wmissing-variable-declarations
  riscv,mmio: Fix readX()-to-delay() ordering
  riscv: Fix CPU feature detection with SMP disabled
  riscv: Start of DRAM should at least be aligned on PMD size for the direct mapping
  riscv/kexec: load initrd high in available memory
  riscv/kexec: handle R_RISCV_CALL_PLT relocation type
2023-08-11 | Merge tag 'parisc-for-6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux | Linus Torvalds
Pull parisc architecture fixes from Helge Deller:

"A bugfix in the LWS code, which used different lock words than the parisc lightweight spinlock checks. This inconsistency triggered false positives when the lightweight spinlock checks checked the locks of mutexes. The other patches are trivial cleanups and most of them fix sparse warnings.

Summary:
 - Fix LWS code to use same lock words as for the parisc lightweight spinlocks
 - Use PTR_ERR_OR_ZERO() in pdt init code
 - Fix lots of sparse warnings"

* tag 'parisc-for-6.5-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: perf: Make cpu_device variable static
  parisc: ftrace: Add declaration for ftrace_function_trampoline()
  parisc: boot: Nuke some sparse warnings in decompressor
  parisc: processor: Include asm/smp.h for init_per_cpu()
  parisc: unaligned: Include linux/sysctl.h for unaligned_enabled
  parisc: Move proc_mckinley_root and proc_runway_root to sba_iommu
  parisc: dma: Add prototype for pcxl_dma_start
  parisc: parisc_ksyms: Include libgcc.h for libgcc prototypes
  parisc: ucmpdi2: Fix no previous prototype for '__ucmpdi2' warning
  parisc: firmware: Mark pdc_result buffers local
  parisc: firmware: Fix sparse context imbalance warnings
  parisc: signal: Fix sparse incorrect type in assignment warning
  parisc: ioremap: Fix sparse warnings
  parisc: fault: Use C99 arrary initializers
  parisc: pdt: Use PTR_ERR_OR_ZERO() to simplify code
  parisc: Fix lightweight spinlock checks to not break futexes
2023-08-11 | x86/uv: Update HPE Superdome Flex Maintainers | Justin Ernst
Mike Travis has retired. His expertise will be sorely missed. Remove Mike's entry under SGI XP/XPC/XPNET DRIVER. Replace Mike's entry under UV HPE SUPERDOME FLEX. Signed-off-by: Justin Ernst <justin.ernst@hpe.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Joel Granados <j.granados@samsung.com> Link: https://lore.kernel.org/all/20230801155756.22308-1-justin.ernst%40hpe.com
2023-08-11 | Merge tag 'cpuidle-psci-v6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm | Linus Torvalds
Pull cpuidle psci fixes from Ulf Hansson:

"A couple of cpuidle-psci fixes. Usually, this is managed by arm-soc maintainers or Rafael, although due to a busy period I have stepped in to help out:

 - Fix the error path to prevent reverting from OSI back to PC mode"

* tag 'cpuidle-psci-v6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  cpuidle: psci: Move enabling OSI mode after power domains creation
  cpuidle: dt_idle_genpd: Add helper function to remove genpd topology
2023-08-11 | Merge tag 'drm-fixes-2023-08-11' of git://anongit.freedesktop.org/drm/drm | Linus Torvalds
Pull drm fixes from Dave Airlie:

"This week's fixes, as expected amdgpu is probably a little larger since it skipped a week, but otherwise a few nouveau fixes, a couple of bridge, rockchip and ivpu fixes.

amdgpu:
 - S/G display workaround for platforms with >= 64G of memory
 - S0i3 fix
 - SMU 13.0.0 fixes
 - Disable SMU 13.x OD features temporarily while the interface is reworked to enable additional functionality
 - Fix cursor gamma issues on DCN3+
 - SMU 13.0.6 fixes
 - Fix possible UAF in CS IOCTL
 - Polaris display regression fix
 - Only enable CP GFX shadowing on SR-IOV

amdkfd:
 - Raven/Picasso KFD regression fix

bridge:
 - it6505: runtime PM fix
 - lt9611: revert Do not generate HFP/HBP/HSA and EOT packet

nouveau:
 - enable global memory loads for helper invocations for userspace driver
 - dp 1.3 dpcd+ workaround fix
 - remove unused function
 - revert incorrect NULL check

accel/ivpu:
 - Add set_pages_array_wc/uc for internal buffers

rockchip:
 - Don't spam logs in atomic check"

* tag 'drm-fixes-2023-08-11' of git://anongit.freedesktop.org/drm/drm: (23 commits)
  drm/shmem-helper: Reset vma->vm_ops before calling dma_buf_mmap()
  drm/amdkfd: disable IOMMUv2 support for Raven
  drm/amdkfd: disable IOMMUv2 support for KV/CZ
  drm/amdkfd: ignore crat by default
  drm/amdgpu/gfx11: only enable CP GFX shadowing on SR-IOV
  drm/amd/display: Fix a regression on Polaris cards
  drm/amdgpu: fix possible UAF in amdgpu_cs_pass1()
  drm/amd/pm: Fix SMU v13.0.6 energy reporting
  drm/amd/display: check attr flag before set cursor degamma on DCN3+
  drm/amd/pm: disable the SMU13 OD feature support temporarily
  drm/amd/pm: correct the pcie width for smu 13.0.0
  drm/amd/display: Don't show stack trace for missing eDP
  drm/amdgpu: Match against exact bootloader status
  drm/amd/pm: skip the RLC stop when S0i3 suspend for SMU v13.0.4/11
  drm/amd: Disable S/G for APUs when 64GB or more host memory
  drm/rockchip: Don't spam logs in atomic check
  accel/ivpu: Add set_pages_array_wc/uc for internal buffers
  drm/nouveau/disp: Revert a NULL check inside nouveau_connector_get_modes
  Revert "drm/bridge: lt9611: Do not generate HFP/HBP/HSA and EOT packet"
  drm/nouveau: remove unused tu102_gr_load() function
  ...
2023-08-11 | ALSA: hda/cs8409: Support new Dell Dolphin Variants | Stefan Binding
Add 4 new Dell Dolphin Systems, same configuration as older systems. Signed-off-by: Stefan Binding <sbinding@opensource.cirrus.com> Link: https://lore.kernel.org/r/20230811123044.1045651-1-sbinding@opensource.cirrus.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-08-11 | nvme: core: don't hold rcu read lock in nvme_ns_chr_uring_cmd_iopoll | Ming Lei
Now nvme_ns_chr_uring_cmd_iopoll() has switched to request based io polling, and the associated NS is guaranteed to be live in case of io polling, so request is guaranteed to be valid because blk-mq uses pre-allocated request pool. Remove the rcu read lock in nvme_ns_chr_uring_cmd_iopoll(), which isn't needed any more after switching to request based io polling. Fix "BUG: sleeping function called from invalid context" because set_page_dirty_lock() from blk_rq_unmap_user() may sleep. Fixes: 585079b6e425 ("nvme: wire up async polling for io passthrough commands") Reported-by: Guangwu Zhang <guazhang@redhat.com> Cc: Kanchan Joshi <joshi.k@samsung.com> Cc: Anuj Gupta <anuj20.g@samsung.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Guangwu Zhang <guazhang@redhat.com> Link: https://lore.kernel.org/r/20230809020440.174682-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-08-11 | mm: invalidation check mapping before folio_contains | Hugh Dickins
Enabling tmpfs "direct IO" exposes it to invalidate_inode_pages2_range(), which when swapping can hit the VM_BUG_ON_FOLIO(!folio_contains()): the folio has been moved from page cache to swap cache (with folio->mapping reset to NULL), but the folio_index() embedded in folio_contains() sees swapcache, and so returns the swapcache_index() - whereas folio->index would be the right one to check against the index from mapping's xarray. There are different ways to fix this, but my preference is just to order the checks in invalidate_inode_pages2_range() the same way that they are in __filemap_get_folio() and find_lock_entries() and filemap_fault(): check folio->mapping before folio_contains(). Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <f0b31772-78d7-f198-6482-9f25aab8c13f@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
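Why the order of the two checks matters can be shown with a toy model in user-space C. The structures and helpers below are hypothetical simplifications, not the kernel's folio API: the index helper reports the swap-cache index once the folio has left the page cache, which is exactly what makes the containment check misfire if it runs first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mapping { int id; };

struct folio {
    struct mapping *mapping;   /* NULL once moved to the swap cache */
    unsigned long index;       /* page-cache index */
    unsigned long swap_index;  /* index used while in the swap cache */
    unsigned long nr_pages;
    bool swapcache;
};

/* Mimics an index helper that returns the swap-cache index when set. */
unsigned long toy_folio_index(const struct folio *f)
{
    return f->swapcache ? f->swap_index : f->index;
}

bool toy_folio_contains(const struct folio *f, unsigned long index)
{
    return index - toy_folio_index(f) < f->nr_pages;
}

/* Returns true if the folio found at slot `index` should be invalidated. */
bool should_invalidate(const struct folio *f, const struct mapping *m,
                       unsigned long index)
{
    if (f->mapping != m)
        return false;  /* folio left this mapping: skip it first */

    /* Stands in for the VM_BUG_ON_FOLIO() check; safe only after the above. */
    assert(toy_folio_contains(f, index));
    return true;
}
```

With the mapping test first, a folio that has moved to the swap cache is skipped before the containment assertion can see the wrong index.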
2023-08-11 | tmpfs: trivial support for direct IO | Hugh Dickins
Depending upon your philosophical viewpoint, either tmpfs always does direct IO, or it cannot ever do direct IO; but whichever, if tmpfs is to stand in for a more sophisticated filesystem, it can be helpful for tmpfs to support O_DIRECT. So, give tmpfs a shmem_file_open() method, to set the FMODE_CAN_ODIRECT flag: then unchanged shmem_file_read_iter() and new shmem_file_write_iter() do the work (without any shmem_direct_IO() stub). Perhaps later, once the direct_IO method has been eliminated from all filesystems, generic_file_write_iter() will be such that tmpfs can again use it, even for O_DIRECT. xfstests auto generic which were not run on tmpfs before but now pass: 036 091 113 125 130 133 135 198 207 208 209 210 211 212 214 226 239 263 323 355 391 406 412 422 427 446 451 465 551 586 591 609 615 647 708 729 with no new failures. LTP dio tests which were not run on tmpfs before but now pass: dio01 through dio30, except for dio04 and dio10, which fail because tmpfs dio read and write allow odd count: tmpfs could be made stricter, but would that be an improvement? Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <6f2742-6f1f-cae9-7c5b-ed20fc53215@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 | kselftest/arm64: Size sycall-abi buffers for the actual maximum VL | Mark Brown
Our ABI opts to provide future proofing by defining a much larger SVE_VQ_MAX than the architecture actually supports. Since we use this define to control the size of our vector data buffers, this results in a lot of overhead when we initialise, which can be a very noticeable problem in emulation: we fill buffers that are orders of magnitude larger than we will ever actually use, even with virtual platforms that provide the full range of architecturally supported vector lengths. Define and use the actual architecture maximum to mitigate this. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230810-arm64-syscall-abi-perf-v1-1-6a0d7656359c@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
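A rough size calculation shows the scale of the saving. The constants below are assumptions for illustration: 512 quadwords for the ABI's SVE_VQ_MAX, 16 quadwords for the 2048-bit architectural maximum vector length, 32 Z registers, and a square ZA array whose side is one vector length. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define VQ_BYTES     16   /* one quadword is 128 bits */
#define ABI_VQ_MAX   512  /* assumed value of the ABI's SVE_VQ_MAX */
#define ARCH_VQ_MAX  16   /* 2048-bit architectural maximum */
#define NUM_Z_REGS   32

/* Bytes needed to hold all Z registers at a given vector quadword count. */
size_t z_buffer_bytes(size_t vq)
{
    assert(vq > 0);
    return NUM_Z_REGS * vq * VQ_BYTES;
}

/* Bytes needed for a square ZA array whose side is one vector length. */
size_t za_buffer_bytes(size_t vq)
{
    size_t vl_bytes = vq * VQ_BYTES;

    assert(vq > 0);
    return vl_bytes * vl_bytes;
}
```

Under these assumptions the Z register buffer shrinks by a factor of 32 and the ZA buffer by a factor of 1024, which matters when every byte is filled under emulation.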
2023-08-11 | kselftest/arm64: add lse and lse2 features to hwcap test | Zeng Heng
Add the LSE and various features check in the set of hwcap tests. As stated in the ARM manual, the LSE2 feature allows for atomic access to unaligned memory. Therefore, for processors that only have the LSE feature, we register .sigbus_fn to test their ability to perform unaligned access. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230808134036.668954-6-zengheng4@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-11 | kselftest/arm64: add test item that support to capturing the SIGBUS signal | Zeng Heng
Some enhanced features, such as LSE2, do not result in SIGILL when LSE2 is missing and LSE is present, but generate a SIGBUS exception on an unaligned atomic access. Therefore, add a test item to cover this type of feature. Note that testing for SIGBUS only makes sense after making sure that the instruction does not cause a SIGILL signal. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230808134036.668954-5-zengheng4@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
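The two-stage check (SIGILL first, SIGBUS only afterwards) can be sketched portably. This is not the selftest's code: the probes here use raise() as a stand-in for a faulting instruction, and every function name is hypothetical. A real probe would also need its handler to step over the faulting instruction, which this sketch does not attempt.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

static volatile sig_atomic_t seen_sigill;
static volatile sig_atomic_t seen_sigbus;

static void on_sigill(int sig) { (void)sig; seen_sigill = 1; }
static void on_sigbus(int sig) { (void)sig; seen_sigbus = 1; }

/* Install `handler` for `signum` only while `probe` runs, then restore. */
static void run_with_handler(int signum, void (*handler)(int),
                             void (*probe)(void))
{
    struct sigaction sa, old;
    int ret;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);

    ret = sigaction(signum, &sa, &old);
    assert(ret == 0);
    probe();
    ret = sigaction(signum, &old, NULL);
    assert(ret == 0);
}

/* Stand-in probes: a real test would execute the instruction under test. */
void probe_noop(void) { }
void probe_raise_sigill(void) { raise(SIGILL); }
void probe_raise_sigbus(void) { raise(SIGBUS); }

/* A feature counts as usable only if neither signal is delivered. */
bool feature_usable(void (*sigill_probe)(void), void (*sigbus_probe)(void))
{
    seen_sigill = 0;
    seen_sigbus = 0;

    run_with_handler(SIGILL, on_sigill, sigill_probe);
    if (seen_sigill)
        return false;  /* the SIGBUS check is meaningless in this case */

    run_with_handler(SIGBUS, on_sigbus, sigbus_probe);
    return !seen_sigbus;
}
```

Restoring the previous handler after each probe keeps the non-default disposition scoped to the check itself.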
2023-08-11 | kselftest/arm64: add DEF_SIGHANDLER_FUNC() and DEF_INST_RAISE_SIG() helpers | Zeng Heng
Add macro definition functions DEF_SIGHANDLER_FUNC() and DEF_INST_RAISE_SIG() helpers. Furthermore, there is no need to modify the default SIGILL handling function throughout the entire testing lifecycle in the main() function. It is reasonable to narrow the scope to the context of the sig_fn function only. This is a pre-patch for the subsequent SIGBUS handler patch. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230808134036.668954-4-zengheng4@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-11 | kselftest/arm64: add crc32 feature to hwcap test | Zeng Heng
Add the CRC32 feature check in the set of hwcap tests. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230808134036.668954-3-zengheng4@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-11 | kselftest/arm64: add float-point feature to hwcap test | Zeng Heng
Add the FP feature check in the set of hwcap tests. Signed-off-by: Zeng Heng <zengheng4@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230808134036.668954-2-zengheng4@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-11 | arm64: syscall: unmask DAIF earlier for SVCs | Mark Rutland
For a number of historical reasons, when handling SVCs we don't unmask DAIF in el0_svc() or el0_svc_compat(), and instead do so later in el0_svc_common(). This is unfortunate and makes it harder to make changes to the DAIF management in entry-common.c as we'd like to do as cleanup and preparation for FEAT_NMI support. We can move the DAIF unmasking to entry-common.c as long as we also hoist the fp_user_discard() logic, as reasoned below.

We converted the syscall trace logic from assembly to C in commit:

  f37099b6992a0b81 ("arm64: convert syscall trace logic to C")

... which was intended to have no functional change, and mirrored the existing assembly logic to avoid the risk of any functional regression.

With the logic in C, it's clear that there is currently no reason to unmask DAIF so late within el0_svc_common():

* The thread flags are read prior to unmasking DAIF, but are not consumed until after DAIF is unmasked, and we don't perform a read-modify-write sequence of the thread flags for which we might need to serialize against an IPI modifying the flags. Similarly, for any thread flags set by other threads, whether DAIF is masked or not has no impact.

  The read_thread_flags() helper performs a single-copy-atomic read of the flags, and so this can safely be moved after unmasking DAIF.

* The pt_regs::orig_x0 and pt_regs::syscallno fields are neither consumed nor modified by the handler for any DAIF exception (e.g. these do not exist in the `perf_event_arm_regs` enum and are not sampled by perf in its IRQ handler).

  Thus, the manipulation of pt_regs::orig_x0 and pt_regs::syscallno can safely be moved after unmasking DAIF.

Given the above, we can safely hoist unmasking of DAIF out of el0_svc_common(), and into its immediate callers: do_el0_svc() and do_el0_svc_compat(). Further:

* In do_el0_svc(), we sample the syscall number from pt_regs::regs[8]. This is not modified by the handler for any DAIF exception, and thus can safely be moved after unmasking DAIF.

  As fp_user_discard() operates on the live FP/SVE/SME register state, this needs to occur before we clear DAIF.IF, as interrupts could result in preemption which would cause this state to become foreign. As fp_user_discard() is the first function called within do_el0_svc(), it has no dependency on other parts of do_el0_svc() and can be moved earlier so long as it is called prior to unmasking DAIF.IF.

* In do_el0_svc_compat(), we sample the syscall number from pt_regs::regs[7]. This is not modified by the handler for any DAIF exception, and thus can safely be moved after unmasking DAIF.

  Compat threads cannot use SVE or SME, so there's no need for el0_svc_compat() to call fp_user_discard().

Given the above, we can safely hoist the unmasking of DAIF out of do_el0_svc() and do_el0_svc_compat(), and into their immediate callers: el0_svc() and el0_svc_compat(), so long as we also hoist fp_user_discard() into el0_svc().

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230808101148.1064172-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-08-11 | xfs use fs_holder_ops for the log and RT devices | Christoph Hellwig
Use the generic fs_holder_ops to shut down the file system when the log or RT device goes away instead of duplicating the logic. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230802154131.2221419-13-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 | xfs: drop s_umount over opening the log and RT devices | Christoph Hellwig
Just like get_tree_bdev needs to drop s_umount when opening the main device, we need to do the same for the xfs log and RT devices to avoid a potential lock order reversal with s_umount for the mark_dead path. It might be preferable to just drop s_umount over ->fill_super entirely, but that will require a fairly massive audit first, so we'll do the easy version here first. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Message-Id: <20230802154131.2221419-12-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 | ext4: use fs_holder_ops for the log device | Christoph Hellwig
Use the generic fs_holder_ops to shut down the file system when the log device goes away instead of duplicating the logic. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230802154131.2221419-11-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 | ext4: drop s_umount over opening the log device | Christoph Hellwig
Just like get_tree_bdev needs to drop s_umount when opening the main device, we need to do the same for the ext4 log device to avoid a potential lock order reversal with s_umount for the mark_dead path. It might be preferable to just drop s_umount over ->fill_super entirely, but that will require a fairly massive audit first, so we'll do the easy version here first. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-10-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 | fs: export fs_holder_ops | Christoph Hellwig
Export fs_holder_ops so that file systems that open additional block devices can use it as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-9-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 | fs: stop using get_super in fs_mark_dead | Christoph Hellwig
fs_mark_dead currently uses get_super to find the superblock for the block device that is going away. This means it is limited to the main device stored in sb->s_dev, leading to a lot of code duplication for file systems that can use multiple block devices. Now that the holder for all block devices used by file systems is set to the super_block, we can instead look at that holder and then check if the file system is born and active, so do that instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-8-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
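The holder-based lookup can be sketched in user-space C. The structures below are hypothetical reductions of the real ones: each opened block device records its holder, so a single callback can find the owning file system for the main device and for any additional device alike.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct super_block {
    bool born;       /* fully set up */
    bool active;     /* still has active references */
    bool shut_down;
};

struct block_device {
    void *holder;    /* set to the owning super_block when opened */
};

/*
 * Called when `bdev` goes away. Returns true if a live file system
 * was found through the holder and shut down.
 */
bool toy_mark_dead(struct block_device *bdev)
{
    struct super_block *sb;

    assert(bdev != NULL);
    sb = bdev->holder;
    if (sb == NULL || !sb->born || !sb->active)
        return false;

    sb->shut_down = true;
    return true;
}
```

Because the lookup goes through the holder instead of matching a device number against the main device, a log or realtime device needs no duplicated logic.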
2023-08-11 | fs: use the super_block as holder when mounting file systems | Christoph Hellwig
The file system type is not a very useful holder as it doesn't allow us to go back to the actual file system instance. Pass the super_block instead which is useful when passed back to the file system driver. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christian Brauner <brauner@kernel.org> Message-Id: <20230802154131.2221419-7-hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 | net: pcs: Add missing put_device call in miic_create | Xiang Yang
The reference to pdev->dev is taken by of_find_device_by_node(), so it should be released when it is no longer needed. Fixes: 7dc54d3b8d91 ("net: pcs: add Renesas MII converter driver") Signed-off-by: Xiang Yang <xiangyang3@huawei.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-11 | virtio-net: set queues after driver_ok | Jason Wang
Commit 25266128fe16 ("virtio-net: fix race between set queues and probe") tries to fix the race between set queues and probe by calling _virtnet_set_queues() before DRIVER_OK is set. This violates the virtio spec. Fix this by setting the queues after virtio_device_ready(). Note that rtnl needs to be held for userspace requests to change the number of queues, so we are serialized in this way. Fixes: 25266128fe16 ("virtio-net: fix race between set queues and probe") Reported-by: Dragos Tatulea <dtatulea@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-11 | btrfs: convert to multigrain timestamps | Jeff Layton
Enable multigrain timestamps, which should ensure that there is an apparent change to the timestamp whenever it has been written after being actively observed via getattr. Beyond enabling the FS_MGTIME flag, this patch eliminates update_time_for_write, which goes to great pains to avoid in-memory stores. Just have it overwrite the timestamps unconditionally. Signed-off-by: Jeff Layton <jlayton@kernel.org> Acked-by: David Sterba <dsterba@suse.com> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230807-mgctime-v7-13-d1dec143a704@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 | ext4: switch to multigrain timestamps | Jeff Layton
Enable multigrain timestamps, which should ensure that there is an apparent change to the timestamp whenever it has been written after being actively observed via getattr. For ext4, we only need to enable the FS_MGTIME flag. Acked-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org> Message-Id: <20230807-mgctime-v7-12-d1dec143a704@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 | xfs: switch to multigrain timestamps | Jeff Layton
Enable multigrain timestamps, which should ensure that there is an apparent change to the timestamp whenever it has been written after being actively observed via getattr. Also, anytime the mtime changes, the ctime must also change, and those are now the only two options for xfs_trans_ichgtime. Have that function unconditionally bump the ctime, and ASSERT that XFS_ICHGTIME_CHG is always set. Acked-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Jeff Layton <jlayton@kernel.org> Message-Id: <20230807-mgctime-v7-11-d1dec143a704@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 | tmpfs: add support for multigrain timestamps | Jeff Layton
Enable multigrain timestamps, which should ensure that there is an apparent change to the timestamp whenever it has been written after being actively observed via getattr. tmpfs only requires the FS_MGTIME flag. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org> Message-Id: <20230807-mgctime-v7-10-d1dec143a704@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 | fs: add infrastructure for multigrain timestamps | Jeff Layton
The VFS always uses coarse-grained timestamps when updating the ctime and mtime after a change. This has the benefit of allowing filesystems to optimize away a lot of metadata updates, down to around 1 per jiffy, even when a file is under heavy writes. Unfortunately, this has always been an issue when we're exporting via NFSv3, which relies on timestamps to validate caches. A lot of changes can happen in a jiffy, so timestamps aren't sufficient to help the client decide to invalidate the cache. Even with NFSv4, a lot of exported filesystems don't properly support a change attribute and are subject to the same problems with timestamp granularity. Other applications have similar issues with timestamps (e.g. backup applications). If we were to always use fine-grained timestamps, that would improve the situation, but that becomes rather expensive, as the underlying filesystem would have to log a lot more metadata updates. What we need is a way to only use fine-grained timestamps when they are being actively queried. POSIX generally mandates that when the mtime changes, the ctime must also change. The kernel always stores normalized ctime values, so only the first 30 bits of the tv_nsec field are ever used. Use the 31st bit of the ctime tv_nsec field to indicate that something has queried the inode for the mtime or ctime. When this flag is set, on the next mtime or ctime update, the kernel will fetch a fine-grained timestamp instead of the usual coarse-grained one. Filesystems can opt into this behavior by setting the FS_MGTIME flag in the fstype. Filesystems that don't set this flag will continue to use coarse-grained timestamps. Later patches will convert individual filesystems to use the new infrastructure. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230807-mgctime-v7-9-d1dec143a704@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
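The flag-in-tv_nsec scheme can be illustrated with a short user-space sketch. A normalized nanosecond value is below 10^9, which is less than 2^30, so bit 30 (the "31st bit") is free to carry the "queried" marker. The structure and function names are hypothetical, and the sketch ignores the seconds field and any atomicity the real implementation would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC  1000000000u
#define QUERIED_FLAG  (1u << 30)  /* 10^9 < 2^30, so this bit is unused */

struct toy_inode {
    uint32_t ctime_nsec;  /* low 30 bits: nanoseconds; bit 30: queried */
};

/* Reading the timestamp marks it as observed and hides the flag. */
uint32_t toy_getattr_nsec(struct toy_inode *inode)
{
    inode->ctime_nsec |= QUERIED_FLAG;
    return inode->ctime_nsec & ~QUERIED_FLAG;
}

/*
 * Update the ctime, using the fine-grained value only if the previous
 * one was observed. Returns true when the fine-grained value was used.
 */
bool toy_update_ctime(struct toy_inode *inode, uint32_t coarse_nsec,
                      uint32_t fine_nsec)
{
    bool queried = (inode->ctime_nsec & QUERIED_FLAG) != 0;
    uint32_t now = queried ? fine_nsec : coarse_nsec;

    assert(now < NSEC_PER_SEC);
    inode->ctime_nsec = now;  /* a normalized value also clears the flag */
    return queried;
}
```

Writers that nobody is watching keep paying only for coarse timestamps, while a reader that sampled the old value is guaranteed to see a different one after the next change.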
2023-08-11 fs: drop the timespec64 argument from update_time Jeff Layton
Now that all of the update_time operations are prepared for it, we can drop the timespec64 argument from the update_time operation. Do that and remove it from some associated functions like inode_update_time and inode_needs_update_time. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230807-mgctime-v7-8-d1dec143a704@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 xfs: have xfs_vn_update_time gets its own timestamp Jeff Layton
In later patches we're going to drop the "now" parameter from the update_time operation. Prepare XFS for this by reworking how it fetches timestamps and sets them in the inode. Ensure that we update the ctime even if only S_MTIME is set. Signed-off-by: Jeff Layton <jlayton@kernel.org> Acked-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230807-mgctime-v7-7-d1dec143a704@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 fat: make fat_update_time get its own timestamp Jeff Layton
In later patches, we're going to drop the "now" parameter from the update_time operation. Fix fat_update_time to fetch its own timestamp. It turns out that this is easily done by just passing a NULL timestamp pointer to fat_truncate_time. Signed-off-by: Jeff Layton <jlayton@kernel.org> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Frank Sorenson <sorenson@redhat.com> Message-Id: <20230810-ctime-fat-v1-2-327598fd1de8@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 fat: remove i_version handling from fat_update_time Jeff Layton
commit 6bb885ecd746 ("fat: add functions to update and truncate timestamps appropriately") added an update_time routine for fat. That patch added a section for handling the S_VERSION bit, even though FAT doesn't enable SB_I_VERSION and the S_VERSION bit will never be set when calling it. Remove the section for handling S_VERSION since it's effectively dead code, and will be problematic vs. future changes. Cc: Frank Sorenson <sorenson@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Frank Sorenson <sorenson@redhat.com> Message-Id: <20230810-ctime-fat-v1-1-327598fd1de8@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 ubifs: have ubifs_update_time use inode_update_timestamps Jeff Layton
In later patches, we're going to drop the "now" parameter from the update_time operation. Prepare ubifs for this, by having it use the new inode_update_timestamps helper. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Message-Id: <20230807-mgctime-v7-6-d1dec143a704@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-08-11 cpufreq: amd-pstate-ut: Modify the function to get the highest_perf value Meng Li
The previous function amd_get_highest_perf() will be deprecated. It can only return 166 or 255, based on cpuinfo. For platforms that support preferred core, the value of highest perf can be between 166 and 255. Therefore, it will cause amd-pstate-ut to fail when running amd_pstate_ut_check_perf(). Signed-off-by: Meng Li <li.meng@amd.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2023-08-10 gcc-plugins: Rename last_stmt() for GCC 14+ Kees Cook
In GCC 14, last_stmt() was renamed to last_nondebug_stmt(). Add a helper macro to handle the renaming. Cc: linux-hardening@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-10 selftests/harness: Actually report SKIP for signal tests Kees Cook
Tests that were expecting a signal were not correctly checking for a SKIP condition. Move the check before the signal checking when processing the test result. Cc: Shuah Khan <shuah@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Will Drewry <wad@chromium.org> Cc: linux-kselftest@vger.kernel.org Fixes: 9847d24af95c ("selftests/harness: Refactor XFAIL into SKIP") Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-11 Merge tag 'amd-drm-fixes-6.5-2023-08-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes Dave Airlie
amd-drm-fixes-6.5-2023-08-09:

amdgpu:
- S/G display workaround for platforms with >= 64G of memory
- S0i3 fix
- SMU 13.0.0 fixes
- Disable SMU 13.x OD features temporarily while the interface is reworked to enable additional functionality
- Fix cursor gamma issues on DCN3+
- SMU 13.0.6 fixes
- Fix possible UAF in CS IOCTL
- Polaris display regression fix
- Only enable CP GFX shadowing on SR-IOV

amdkfd:
- Raven/Picasso KFD regression fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230809182827.8135-1-alexander.deucher@amd.com
2023-08-11 erofs: boost negative xattr lookup with bloom filter Jingbo Xu
Optimise the negative xattr lookup with a bloom filter. The bit value for the bloom filter map has reversed semantics for compatibility. That is, a bit value of 0 indicates existence, while a bit value of 1 indicates the absence of the corresponding xattr. The initial version is _only_ enabled when xattr_filter_reserved is zero. The filter map internals may change in the future, in which case the reserved flag will be set non-zero and we won't need to bother with the compatible bits again at that time. For now, disable the optimization if this reserved flag is non-zero. Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20230722094538.11754-3-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
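The reversed bit semantics can be illustrated with a small Python sketch. It is only a model of the lookup shortcut under stated assumptions: `zlib.crc32` stands in for the real hash (EROFS does not use it), and the names `filter_bit`, `build_filter`, `definitely_absent` and `ALL_ABSENT` are hypothetical.

```python
import zlib

FILTER_BITS = 32
ALL_ABSENT = (1 << FILTER_BITS) - 1  # every bit 1: no xattr recorded


def filter_bit(name, seed=0):
    # Stand-in hash chosen for illustration only, not the EROFS hash.
    return zlib.crc32(name.encode(), seed) % FILTER_BITS


def build_filter(names):
    """Start from all ones and clear the bit of every xattr that exists."""
    bits = ALL_ABSENT
    for name in names:
        bits &= ~(1 << filter_bit(name))
    return bits


def definitely_absent(bits, name):
    """A set bit (1) proves absence, so the full lookup can be skipped.
    A cleared bit (0) only means the xattr may exist."""
    return bool(bits & (1 << filter_bit(name)))
```

As with any bloom filter, a cleared bit can be a false positive (two names may map to the same bit), so a regular lookup is still needed in that case; only the "definitely absent" answer is a shortcut.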
2023-08-11 erofs: update on-disk format for xattr name filter Jingbo Xu
The xattr name bloom filter feature is going to be introduced to speed up the negative xattr lookup, e.g. system.posix_acl_[access|default] lookup when running an "ls -lR" workload.

There are some commonly used extended attributes (n) and the total number of these is approximately 30.

    trusted.overlay.opaque
    trusted.overlay.redirect
    trusted.overlay.origin
    trusted.overlay.impure
    trusted.overlay.nlink
    trusted.overlay.upper
    trusted.overlay.metacopy
    trusted.overlay.protattr
    user.overlay.opaque
    user.overlay.redirect
    user.overlay.origin
    user.overlay.impure
    user.overlay.nlink
    user.overlay.upper
    user.overlay.metacopy
    user.overlay.protattr
    security.evm
    security.ima
    security.selinux
    security.SMACK64
    security.SMACK64IPIN
    security.SMACK64IPOUT
    security.SMACK64EXEC
    security.SMACK64TRANSMUTE
    security.SMACK64MMAP
    security.apparmor
    security.capability
    system.posix_acl_access
    system.posix_acl_default
    user.mime_type

Given the number of bits of the bloom filter (m) is 32, the optimal value for the number of the hash functions (k) is 1 (ln2 * m/n = 0.74).

The single hash function is implemented as:

    xxh32(name, strlen(name), EROFS_XATTR_FILTER_SEED + index)

where `index` represents the index of the corresponding predefined short name prefix, while `name` represents the name string after stripping the above predefined name prefix.

The constant magic number EROFS_XATTR_FILTER_SEED, i.e. 0x25BBE08F, is used to give a better spread when mapping these 30 extended attributes into the 32-bit bloom filter as:

    bit  0: security.ima
    bit  1:
    bit  2: trusted.overlay.nlink
    bit  3:
    bit  4: user.overlay.nlink
    bit  5: trusted.overlay.upper
    bit  6: user.overlay.origin
    bit  7: trusted.overlay.protattr
    bit  8: security.apparmor
    bit  9: user.overlay.protattr
    bit 10: user.overlay.opaque
    bit 11: security.selinux
    bit 12: security.SMACK64TRANSMUTE
    bit 13: security.SMACK64
    bit 14: security.SMACK64MMAP
    bit 15: user.overlay.impure
    bit 16: security.SMACK64IPIN
    bit 17: trusted.overlay.redirect
    bit 18: trusted.overlay.origin
    bit 19: security.SMACK64IPOUT
    bit 20: trusted.overlay.opaque
    bit 21: system.posix_acl_default
    bit 22:
    bit 23: user.mime_type
    bit 24: trusted.overlay.impure
    bit 25: security.SMACK64EXEC
    bit 26: user.overlay.redirect
    bit 27: user.overlay.upper
    bit 28: security.evm
    bit 29: security.capability
    bit 30: system.posix_acl_access
    bit 31: trusted.overlay.metacopy, user.overlay.metacopy

h_name_filter is introduced to the on-disk per-inode xattr header to place the corresponding xattr name filter, where bit value 1 indicates non-existence for compatibility.

This feature is indicated by the EROFS_FEATURE_COMPAT_XATTR_FILTER compatible feature bit.

Reserve one byte in the on-disk superblock as the on-disk format for the xattr name filter may change in the future. With this flag we won't need to bother with these compatible bits again at that time.

Suggested-by: Alexander Larsson <alexl@redhat.com>
Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20230722094538.11754-2-jefflexu@linux.alibaba.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
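The sizing arithmetic quoted above (ln2 * m/n = 0.74, hence k = 1) can be checked with a few lines of Python. This is a sketch using the textbook bloom filter formulas; the function names are hypothetical, and reading `n_items` in `false_positive_rate` as the number of xattrs actually present on one inode is an assumption made here for illustration.

```python
import math


def optimal_hash_count(m_bits, n_items):
    """Textbook optimum k = ln(2) * m / n, before rounding to an integer."""
    return math.log(2) * m_bits / n_items


def false_positive_rate(m_bits, n_items, k_hashes):
    """Textbook estimate (1 - (1 - 1/m)^(k*n))^k for a filter holding n items."""
    return (1.0 - (1.0 - 1.0 / m_bits) ** (k_hashes * n_items)) ** k_hashes


k_real = optimal_hash_count(32, 30)  # m = 32 bits, n = 30 common names
k = max(1, round(k_real))            # at least one hash function is needed
```

With k = 1, an inode carrying a single xattr leaves 31 of the 32 bits set, so under these formulas a lookup for an unrelated name falls through to the full search only about 1 time in 32.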
2023-08-11 erofs: DEFLATE compression support Gao Xiang
Add DEFLATE compression as the 3rd supported algorithm. DEFLATE has been a popular general-purpose compression algorithm for quite a long time (many advanced formats like gzip, zlib, zip, png are all based on it), as the Apple documentation states: "If you require interoperability with non-Apple devices, use COMPRESSION_ZLIB. [1]". Due to its popularity, there are several hardware DEFLATE accelerators on the market, such as (s390) DFLTCC, (Intel) IAA/QAT, (HiSilicon) ZIP accelerator, etc. In addition, there are also several high-performance IP cores and even open-source FPGA approaches available for DEFLATE. Therefore, it's useful to support DEFLATE compression in order to find a way to utilize these accelerators for asynchronous I/Os and get benefits from these later. Besides, it's a good trade-off between compression ratio and performance compared to LZ4 and LZMA. The DEFLATE core format is simple as well as easy to understand, therefore the code size of its decompressor is small even for the bootloader use cases. The runtime memory consumption is quite limited too (e.g. 32K + ~7K for each zlib stream). As usual, EROFS outperforms similar approaches too. Alternatively, DEFLATE could still be used for some specific files since EROFS supports multiple compression algorithms in one image. [1] https://developer.apple.com/documentation/compression/compression_algorithm Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20230810154859.118330-1-hsiangkao@linux.alibaba.com