summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2025-05-31zram: support deflate-specific paramsSergey Senozhatsky
Introduce support of algorithm specific parameters in algorithm_params device attribute. The expected format is algorithm.param=value. For starters, add support for deflate.winbits parameter. Link: https://lkml.kernel.org/r/20250514024825.1745489-3-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-31zram: rename ZCOMP_PARAM_NO_LEVELSergey Senozhatsky
Patch series "zram: support algorithm-specific parameters". This patchset adds support for algorithm-specific parameters. For now, only deflate-specific winbits can be configured, which fixes deflate support on some s390 setups. This patch (of 2): Use more generic name because this will be default "un-set" value for more params in the future. Link: https://lkml.kernel.org/r/20250514024825.1745489-1-senozhatsky@chromium.org Link: https://lkml.kernel.org/r/20250514024825.1745489-2-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Mikhail Zaslonko <zaslonko@linux.ibm.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-22drm/i915: track_pfn() -> "pfnmap tracking"David Hildenbrand
track_pfn() does not exist, let's simply refer to it as "pfnmap tracking". Link: https://lkml.kernel.org/r/20250512123424.637989-11-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> [x86 bits] Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Betkov <bp@alien8.de> Cc: Dave Airlie <airlied@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Jann Horn <jannh@google.com> Cc: Jonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Tvrtko Ursulin <tursulin@ursulin.net> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-22mm: numa_memblks: introduce numa_add_reserved_memblkYuquan Wang
acpi_parse_cfmws() currently adds empty CFMWS ranges to numa_meminfo with the expectation that numa_cleanup_meminfo moves them to numa_reserved_meminfo. There is no need for that indirection when it is known in advance that these unpopulated ranges are meant for numa_reserved_meminfo in support of future hotplug / CXL provisioning. Introduce and use numa_add_reserved_memblk() to add the empty CFMWS ranges directly. Link: https://lkml.kernel.org/r/20250508022719.3941335-1-wangyuquan1236@phytium.com.cn Signed-off-by: Yuquan Wang <wangyuquan1236@phytium.com.cn> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Cc: Bruno Faccini <bfaccini@nvidia.com> Cc: Chen Baozi <chenbaozi@phytium.com.cn> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Joanthan Cameron <Jonathan.Cameron@huawei.com> Cc: Len Brown <lenb@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Robert Richter <rrichter@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-21mm/mempolicy: Weighted Interleave Auto-tuningJoshua Hahn
On machines with multiple memory nodes, interleaving page allocations across nodes allows for better utilization of each node's bandwidth. Previous work by Gregory Price [1] introduced weighted interleave, which allowed for pages to be allocated across nodes according to user-set ratios. Ideally, these weights should be proportional to their bandwidth, so that under bandwidth pressure, each node uses its maximal efficient bandwidth and prevents latency from increasing exponentially. Previously, weighted interleave's default weights were just 1s -- which would be equivalent to the (unweighted) interleave mempolicy, which goes through the nodes in a round-robin fashion, ignoring bandwidth information. This patch has two main goals: First, it makes weighted interleave easier to use for users who wish to relieve bandwidth pressure when using nodes with varying bandwidth (CXL). By providing a set of "real" default weights that just work out of the box, users who might not have the capability (or wish to) perform experimentation to find the most optimal weights for their system can still take advantage of bandwidth-informed weighted interleave. Second, it allows for weighted interleave to dynamically adjust to hotplugged memory with new bandwidth information. Instead of manually updating node weights every time new bandwidth information is reported or taken off, weighted interleave adjusts and provides a new set of default weights for weighted interleave to use when there is a change in bandwidth information. To meet these goals, this patch introduces an auto-configuration mode for the interleave weights that provides a reasonable set of default weights, calculated using bandwidth data reported by the system. In auto mode, weights are dynamically adjusted based on whatever the current bandwidth information reports (and responds to hotplug events). This patch still supports users manually writing weights into the nodeN sysfs interface by entering into manual mode. When a user enters manual mode, the system stops dynamically updating any of the node weights, even during hotplug events that shift the optimal weight distribution. A new sysfs interface "auto" is introduced, which allows users to switch between the auto (writing 1 or Y) and manual (writing 0 or N) modes. The system also automatically enters manual mode when a nodeN interface is manually written to. There is one functional change that this patch makes to the existing weighted_interleave ABI: previously, writing 0 directly to a nodeN interface was said to reset the weight to the system default. Before this patch, the default for all weights were 1, which meant that writing 0 and 1 were functionally equivalent. With this patch, writing 0 is invalid. Link: https://lkml.kernel.org/r/20250520141236.2987309-1-joshua.hahnjy@gmail.com [joshua.hahnjy@gmail.com: wordsmithing changes, simplification, fixes] Link: https://lkml.kernel.org/r/20250511025840.2410154-1-joshua.hahnjy@gmail.com [joshua.hahnjy@gmail.com: remove auto_kobj_attr field from struct sysfs_wi_group] Link: https://lkml.kernel.org/r/20250512142511.3959833-1-joshua.hahnjy@gmail.com https://lore.kernel.org/linux-mm/20240202170238.90004-1-gregory.price@memverge.com/ [1] Link: https://lkml.kernel.org/r/20250505182328.4148265-1-joshua.hahnjy@gmail.com Co-developed-by: Gregory Price <gourry@gourry.net> Signed-off-by: Gregory Price <gourry@gourry.net> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com> Suggested-by: Yunjeong Mun <yunjeong.mun@sk.com> Suggested-by: Oscar Salvador <osalvador@suse.de> Suggested-by: Ying Huang <ying.huang@linux.alibaba.com> Suggested-by: Harry Yoo <harry.yoo@oracle.com> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com> Reviewed-by: Honggyu Kim <honggyu.kim@sk.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Joanthan Cameron <Jonathan.Cameron@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Len Brown <lenb@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12DAX: warn when kmem regions are truncated for memory block alignmentGregory Price
Device capacity intended for use as system ram should be aligned to the architecture-defined memory block size or that capacity will be silently truncated and capacity stranded. As hotplug dax memory becomes more prevelant, the memory block size alignment becomes more important for platform and device vendors to pay attention to - so this truncation should not be silent. This issue is particularly relevant for CXL Dynamic Capacity devices, whose capacity may arrive in spec-aligned but block-misaligned chunks. Link: https://lkml.kernel.org/r/20250410142831.217887-1-gourry@gourry.net Suggested-by: David Hildenbrand <david@redhat.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Gregory Price <gourry@gourry.net> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-12arm64: add KHO supportAlexander Graf
We now have all bits in place to support KHO kexecs. Add awareness of KHO in the kexec file as well as boot path for arm64 and adds the respective kconfig option to the architecture so that it can use KHO successfully. Changes to the "chosen" node have been sent to https://github.com/devicetree-org/dt-schema/pull/158. Link: https://lkml.kernel.org/r/20250509074635.3187114-10-changyuanl@google.com Signed-off-by: Alexander Graf <graf@amazon.com> Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Co-developed-by: Changyuan Lyu <changyuanl@google.com> Signed-off-by: Changyuan Lyu <changyuanl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Anthony Yznaga <anthony.yznaga@oracle.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Ashish Kalra <ashish.kalra@amd.com> Cc: Ben Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Betkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Gowans <jgowans@amazon.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Marc Rutland <mark.rutland@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pratyush Yadav <ptyadav@amazon.de> Cc: Rob Herring <robh@kernel.org> Cc: Saravana Kannan <saravanak@google.com> Cc: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Thomas Lendacky <thomas.lendacky@amd.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11zram: modernize writeback interfaceSergey Senozhatsky
The writeback interface supports a page_index=N parameter which performs writeback of the given page. Since we rarely need to writeback just one single page, the typical use case involves a number of writeback calls, each performing writeback of one page: echo page_index=100 > zram0/writeback ... echo page_index=200 > zram0/writeback echo page_index=500 > zram0/writeback ... echo page_index=700 > zram0/writeback One obvious downside of this is that it increases the number of syscalls. Less obvious, but a significantly more important downside, is that when given only one page to post-process zram cannot perform an optimal target selection. This becomes a critical limitation when writeback_limit is enabled, because under writeback_limit we want to guarantee the highest memory savings hence we first need to writeback pages that release the highest amount of zsmalloc pool memory. This patch adds page_indexes=LOW-HIGH parameter to the writeback interface: echo page_indexes=100-200 page_indexes=500-700 > zram0/writeback This gives zram a chance to apply an optimal target selection strategy on each iteration of the writeback loop. We also now permit multiple page_index parameters per call (previously zram would recognize only one page_index) and a mix or single pages and page ranges: echo page_index=42 page_index=99 page_indexes=100-200 \ page_indexes=500-700 > zram0/writeback Apart from that the patch also unifies parameters passing and resembles other "modern" zram device attributes (e.g. recompression), while the old interface used a mixed scheme: values-less parameters for mode and a key=value format for page_index. We still support the "old" value-less format for compatibility reasons. [senozhatsky@chromium.org: simplify parse_page_index() range checks, per Brian] nk: https://lkml.kernel.org/r/20250404015327.2427684-1-senozhatsky@chromium.org [sozhatsky@chromium.org: fix uninitialized variable in zram_writeback_slots(), per Dan] nk: https://lkml.kernel.org/r/20250409112611.1154282-1-senozhatsky@chromium.org Link: https://lkml.kernel.org/r/20250327015818.4148660-1-senozhatsky@chromium.org Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Reviewed-by: Brian Geffon <bgeffon@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Richard Chang <richardycc@google.com> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11acpi,srat: give memory block size advice based on CFMWS alignmentGregory Price
Capacity is stranded when CFMWS regions are not aligned to block size. On x86, block size increases with capacity (2G blocks @ 64G capacity). Use CFMWS base/size to report memory block size alignment advice. Link: https://lkml.kernel.org/r/20250127153405.3379117-4-gourry@gourry.net Signed-off-by: Gregory Price <gourry@gourry.net> Suggested-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Fan Ni <fan.ni@samsung.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Oscar Salvador <osalvador@suse.de> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Betkov <bp@alien8.de> Cc: Bruno Faccini <bfaccini@nvidia.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joanthan Cameron <Jonathan.Cameron@huawei.com> Cc: Len Brown <lenb@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <rrichter@amd.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11memory: implement memory_block_advise/probe_max_sizeGregory Price
Patch series "memory,x86,acpi: hotplug memory alignment advisement", v8. When physical address regions are not aligned to memory block size, the misaligned portion is lost (stranded capacity). Block size (min/max/selected) is architecture defined. Most architectures tend to use the minimum block size or some simplistic heurist. On x86, memory block size increases up to 2GB, and is otherwise fitted to the alignment of non-hotplug (i.e. not special purpose memory). CXL exposes its memory for management through the ACPI CEDT (CXL Early Detection Table) in a field called the CXL Fixed Memory Window. Per the CXL specification, this memory must be aligned to at least 256MB. When a CFMW aligns on a size less than the block size, this causes a loss of up to 2GB per CFMW on x86. It is not uncommon for CFMW to be allocated per-device - though this behavior is BIOS defined. This patch set provides 3 things: 1) implement advise/query functions in driverse/base/memory.c to report/query architecture agnostic hotplug block alignment advice. 2) update x86 memblock size logic to consider the hotplug advice 3) add code in acpi/numa/srat.c to report CFMW alignment advice The advisement interfaces are design to be called during arch_init code prior to allocator and smp_init. start_kernel will call these through setup_arch() (via acpi and mm/init_64.c on x86), which occurs prior to mm_core_init and smp_init - so no need for atomics. There's an attempt to signal callers to advise() that query has already occurred, but this is predicated on the notion that query actually occurs (which presently only happens on the x86 arch). This is to assist debugging future users. Otherwise, the advise() call has been marked __init to help static discovery of bad call times. Once query is called the first time, it will always return the same value. Interfaces return -EBUSY and 0 respectively on systems without hotplug. This patch (of 3): Hotplug memory sources may have opinions on what the memblock size should be - usually for alignment purposes. For example, CXL memory extents can be 256MB with a matching alignment. If this size/alignment is smaller than the block size, it can result in stranded capacity. Implement memory_block_advise_max_size for use prior to allocator init, for software to advise the system on the max block size. Implement memory_block_probe_max_size for use by arch init code to calculate the best block size. Use of advice is architecture defined. The probe value can never change after first probe. Calls to advise after probe will return -EBUSY to aid debugging. On systems without hotplug, always return -ENODEV and 0 respectively. Link: https://lkml.kernel.org/r/20250127153405.3379117-1-gourry@gourry.net Link: https://lkml.kernel.org/r/20250127153405.3379117-2-gourry@gourry.net Signed-off-by: Gregory Price <gourry@gourry.net> Suggested-by: Ira Weiny <ira.weiny@intel.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Oscar Salvador <osalvador@suse.de> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Betkov <bp@alien8.de> Cc: Bruno Faccini <bfaccini@nvidia.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Joanthan Cameron <Jonathan.Cameron@huawei.com> Cc: Len Brown <lenb@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Robert Richter <rrichter@amd.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11zsmalloc: prefer the the original page's node for compressed dataNhat Pham
Currently, zsmalloc, zswap's and zram's backend memory allocator, does not enforce any policy for the allocation of memory for the compressed data, instead just adopting the memory policy of the task entering reclaim, or the default policy (prefer local node) if no such policy is specified. This can lead to several pathological behaviors in multi-node NUMA systems: 1. Systems with CXL-based memory tiering can encounter the following inversion with zswap/zram: the coldest pages demoted to the CXL tier can return to the high tier when they are reclaimed to compressed swap, creating memory pressure on the high tier. 2. Consider a direct reclaimer scanning nodes in order of allocation preference. If it ventures into remote nodes, the memory it compresses there should stay there. Trying to shift those contents over to the reclaiming thread's preferred node further *increases* its local pressure, and provoking more spills. The remote node is also the most likely to refault this data again. This undesirable behavior was pointed out by Johannes Weiner in [1]. 3. For zswap writeback, the zswap entries are organized in node-specific LRUs, based on the node placement of the original pages, allowing for targeted zswap writeback for specific nodes. However, the compressed data of a zswap entry can be placed on a different node from the LRU it is placed on. This means that reclaim targeted at one node might not free up memory used for zswap entries in that node, but instead reclaiming memory in a different node. All of these issues will be resolved if the compressed data go to the same node as the original page. This patch encourages this behavior by having zswap and zram pass the node of the original page to zsmalloc, and have zsmalloc prefer the specified node if we need to allocate new (zs)pages for the compressed data. Note that we are not strictly binding the allocation to the preferred node. We still allow the allocation to fall back to other nodes when the preferred node is full, or if we have zspages with slots available on a different node. This is OK, and still a strict improvement over the status quo: 1. On a system with demotion enabled, we will generally prefer demotions over compressed swapping, and only swap when pages have already gone to the lowest tier. This patch should achieve the desired effect for the most part. 2. If the preferred node is out of memory, letting the compressed data going to other nodes can be better than the alternative (OOMs, keeping cold memory unreclaimed, disk swapping, etc.). 3. If the allocation go to a separate node because we have a zspage with slots available, at least we're not creating extra immediate memory pressure (since the space is already allocated). 3. While there can be mixings, we generally reclaim pages in same-node batches, which encourage zspage grouping that is more likely to go to the right node. 4. A strict binding would require partitioning zsmalloc by node, which is more complicated, and more prone to regression, since it reduces the storage density of zsmalloc. We need to evaluate the tradeoff and benchmark carefully before adopting such an involved solution. [1]: https://lore.kernel.org/linux-mm/20250331165306.GC2110528@cmpxchg.org/ [senozhatsky@chromium.org: coding-style fixes] Link: https://lkml.kernel.org/r/mnvexa7kseswglcqbhlot4zg3b3la2ypv2rimdl5mh5glbmhvz@wi6bgqn47hge Link: https://lkml.kernel.org/r/20250402204416.3435994-1-nphamcs@gmail.com Signed-off-by: Nhat Pham <nphamcs@gmail.com> Suggested-by: Gregory Price <gourry@gourry.net> Acked-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev> Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org> [zram, zsmalloc] Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Yosry Ahmed <yosry.ahmed@linux.dev> [zswap/zsmalloc] Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Joanthan Cameron <Jonathan.Cameron@huawei.com> Cc: Minchan Kim <minchan@kernel.org> Cc: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-11Merge tag 'timers-urgent-2025-05-11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc timers fixes from Ingo Molnar: - Fix time keeping bugs in CLOCK_MONOTONIC_COARSE clocks - Work around absolute relocations into vDSO code that GCC erroneously emits in certain arm64 build environments - Fix a false positive lockdep warning in the i8253 clocksource driver * tag 'timers-urgent-2025-05-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/i8253: Use raw_spinlock_irqsave() in clockevent_i8253_disable() arm64: vdso: Work around invalid absolute relocations from GCC timekeeping: Prevent coarse clocks going backwards
2025-05-11Merge tag 'input-for-v6.15-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - Synaptics touchpad on multiple laptops (Dynabook Portege X30L-G, Dynabook Portege X30-D, TUXEDO InfinityBook Pro 14 v5, Dell Precision M3800, HP Elitebook 850 G1) switched from PS/2 to SMBus mode - a number of new controllers added to xpad driver: HORI Drum controller, PowerA Fusion Pro 4, PowerA MOGA XP-Ultra controller, 8BitDo Ultimate 2 Wireless Controller, 8BitDo Ultimate 3-mode Controller, Hyperkin DuchesS Xbox One controller - fixes to xpad driver to properly handle Mad Catz JOYTECH NEO SE Advanced and PDP Mirror's Edge Official controllers - fixes to xpad driver to properly handle "Share" button on some controllers - a fix for device initialization timing and for waking up the controller in cyttsp5 driver - a fix for hisi_powerkey driver to properly wake up from s2idle state - other assorted cleanups and fixes * tag 'input-for-v6.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: xpad - fix xpad_device sorting Input: xpad - add support for several more controllers Input: xpad - fix Share button on Xbox One controllers Input: xpad - fix two controller table values Input: hisi_powerkey - enable system-wakeup for s2idle Input: synaptics - enable InterTouch on Dell Precision M3800 Input: synaptics - enable InterTouch on TUXEDO InfinityBook Pro 14 v5 Input: synaptics - enable InterTouch on Dynabook Portege X30L-G Input: synaptics - enable InterTouch on Dynabook Portege X30-D Input: synaptics - enable SMBus for HP Elitebook 850 G1 Input: mtk-pmic-keys - fix possible null pointer dereference Input: xpad - add support for 8BitDo Ultimate 2 Wireless Controller Input: cyttsp5 - fix power control issue on wakeup MAINTAINERS: .mailmap: update Mattijs Korpershoek's email address dt-bindings: mediatek,mt6779-keypad: Update Mattijs' email address Input: stmpe-ts - use module alias instead of device table Input: cyttsp5 - ensure minimum reset pulse width Input: sparcspkr - avoid unannotated fall-through input/joystick: magellan: Mark __nonstring look-up table
2025-05-10Input: xpad - fix xpad_device sortingVicki Pfau
A recent commit put one entry in the wrong place. This just moves it to the right place. Signed-off-by: Vicki Pfau <vi@endrift.com> Link: https://lore.kernel.org/r/20250328234345.989761-5-vi@endrift.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-10Input: xpad - add support for several more controllersVicki Pfau
This adds support for several new controllers, all of which include Share buttons: - HORI Drum controller - PowerA Fusion Pro 4 - 8BitDo Ultimate 3-mode Controller - Hyperkin DuchesS Xbox One controller - PowerA MOGA XP-Ultra controller Signed-off-by: Vicki Pfau <vi@endrift.com> Link: https://lore.kernel.org/r/20250328234345.989761-4-vi@endrift.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-10Input: xpad - fix Share button on Xbox One controllersVicki Pfau
The Share button, if present, is always one of two offsets from the end of the file, depending on the presence of a specific interface. As we lack parsing for the identify packet we can't automatically determine the presence of that interface, but we can hardcode which of these offsets is correct for a given controller. More controllers are probably fixable by adding the MAP_SHARE_BUTTON in the future, but for now I only added the ones that I have the ability to test directly. Signed-off-by: Vicki Pfau <vi@endrift.com> Link: https://lore.kernel.org/r/20250328234345.989761-2-vi@endrift.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-10Input: xpad - fix two controller table valuesVicki Pfau
Two controllers -- Mad Catz JOYTECH NEO SE Advanced and PDP Mirror's Edge Official -- were missing the value of the mapping field, and thus wouldn't detect properly. Signed-off-by: Vicki Pfau <vi@endrift.com> Link: https://lore.kernel.org/r/20250328234345.989761-1-vi@endrift.com Fixes: 540602a43ae5 ("Input: xpad - add a few new VID/PID combinations") Fixes: 3492321e2e60 ("Input: xpad - add multiple supported devices") Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-10Input: hisi_powerkey - enable system-wakeup for s2idleUlf Hansson
To wake up the system from s2idle when pressing the power-button, let's convert from using pm_wakeup_event() to pm_wakeup_dev_event(), as it allows us to specify the "hard" in-parameter, which needs to be set for s2idle. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20250306115021.797426-1-ulf.hansson@linaro.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2025-05-10Merge tag 'driver-core-6.15-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core Pull driver core fix from Greg KH: "Here is a single driver core fix for a regression for platform devices that is a regression from a change that went into 6.15-rc1 that affected Pixel devices. It has been in linux-next for over a week with no reported problems" * tag 'driver-core-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: platform: Fix race condition during DMA configure at IOMMU probe time
2025-05-10Merge tag 'usb-6.15-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB driver fixes for 6.15-rc6. Included in here are: - typec driver fixes - usbtmc ioctl fixes - xhci driver fixes - cdnsp driver fixes - some gadget driver fixes Nothing really major, just all little stuff that people have reported being issues. All of these have been in linux-next this week with no reported issues" * tag 'usb-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: xhci: dbc: Avoid event polling busyloop if pending rx transfers are inactive. usb: xhci: Don't trust the EP Context cycle bit when moving HW dequeue usb: usbtmc: Fix erroneous generic_read ioctl return usb: usbtmc: Fix erroneous wait_srq ioctl return usb: usbtmc: Fix erroneous get_stb ioctl error returns usb: typec: tcpm: delay SNK_TRY_WAIT_DEBOUNCE to SRC_TRYWAIT transition USB: usbtmc: use interruptible sleep in usbtmc_read usb: cdnsp: fix L1 resume issue for RTL_REVISION_NEW_LPM version usb: typec: ucsi: displayport: Fix NULL pointer access usb: typec: ucsi: displayport: Fix deadlock usb: misc: onboard_usb_dev: fix support for Cypress HX3 hubs usb: uhci-platform: Make the clock really optional usb: dwc3: gadget: Make gadget_wakeup asynchronous usb: gadget: Use get_status callback to set remote wakeup capability usb: gadget: f_ecm: Add get_status callback usb: host: tegra: Prevent host controller crash when OTG port is used usb: cdnsp: Fix issue with resuming from L1 usb: gadget: tegra-xudc: ACK ST_RC after clearing CTRL_RUN
2025-05-10Merge tag 'staging-6.15-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver fixes from Greg KH: "Here are three small staging driver fixes for 6.15-rc6. These are: - bcm2835-camera driver fix - two axis-fifo driver fixes All of these have been in linux-next for a few weeks with no reported issues" * tag 'staging-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: axis-fifo: Remove hardware resets for user errors staging: axis-fifo: Correct handling of tx_fifo_depth for size validation staging: bcm2835-camera: Initialise dev in v4l2_dev
2025-05-10Merge tag 'char-misc-6.15-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc/IIO driver fixes from Greg KH: "Here are a bunch of small driver fixes (mostly all IIO) for 6.15-rc6. Included in here are: - loads of tiny IIO driver fixes for reported issues - hyperv driver fix for a much-reported and worked on sysfs ring buffer creation bug All of these have been in linux-next for over a week (the IIO ones for many weeks now), with no reported issues" * tag 'char-misc-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (30 commits) Drivers: hv: Make the sysfs node size for the ring buffer dynamic uio_hv_generic: Fix sysfs creation path for ring buffer iio: adis16201: Correct inclinometer channel resolution iio: adc: ad7606: fix serial register access iio: pressure: mprls0025pa: use aligned_s64 for timestamp iio: imu: adis16550: align buffers for timestamp staging: iio: adc: ad7816: Correct conditional logic for store mode iio: adc: ad7266: Fix potential timestamp alignment issue. iio: adc: ad7768-1: Fix insufficient alignment of timestamp. iio: adc: dln2: Use aligned_s64 for timestamp iio: accel: adxl355: Make timestamp 64-bit aligned using aligned_s64 iio: temp: maxim-thermocouple: Fix potential lack of DMA safe buffer. iio: chemical: pms7003: use aligned_s64 for timestamp iio: chemical: sps30: use aligned_s64 for timestamp iio: imu: inv_mpu6050: align buffer for timestamp iio: imu: st_lsm6dsx: Fix wakeup source leaks on device unbind iio: adc: qcom-spmi-iadc: Fix wakeup source leaks on device unbind iio: accel: fxls8962af: Fix wakeup source leaks on device unbind iio: adc: ad7380: fix event threshold shift iio: hid-sensor-prox: Fix incorrect OFFSET calculation ...
2025-05-10Merge tag 'i2c-for-6.15-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: - omap: use correct function to read from device tree - MAINTAINERS: remove Seth from ISMT maintainership * tag 'i2c-for-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: MAINTAINERS: Remove entry for Seth Heasley i2c: omap: fix deprecated of_property_read_bool() use
2025-05-10Merge tag 'for-linus-6.15a-rc6-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - A fix for the xenbus driver allowing to use a PVH Dom0 with Xenstore running in another domain - A fix for the xenbus driver addressing a rare race condition resulting in NULL dereferences and other problems - A fix for the xen-swiotlb driver fixing a problem seen on Arm platforms * tag 'for-linus-6.15a-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xenbus: Use kref to track req lifetime xenbus: Allow PVH dom0 a non-local xenstore xen: swiotlb: Use swiotlb bouncing if kmalloc allocation demands it
2025-05-09Merge tag 'rust-fixes-6.15-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux Pull rust fixes from Miguel Ojeda: - Make CFI_AUTO_DEFAULT depend on !RUST or Rust >= 1.88.0 - Clean Rust (and Clippy) lints for the upcoming Rust 1.87.0 and 1.88.0 releases - Clean objtool warning for the upcoming Rust 1.87.0 release by adding one more noreturn function * tag 'rust-fixes-6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: x86/Kconfig: make CFI_AUTO_DEFAULT depend on !RUST or Rust >= 1.88 rust: clean Rust 1.88.0's `clippy::uninlined_format_args` lint rust: clean Rust 1.88.0's warning about `clippy::disallowed_macros` configuration rust: clean Rust 1.88.0's `unnecessary_transmutes` lint rust: allow Rust 1.87.0's `clippy::ptr_eq` lint objtool/rust: add one more `noreturn` Rust function for Rust 1.87.0
2025-05-09Merge tag 'drm-fixes-2025-05-10' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Weekly drm fixes, bit bigger than last week, but overall amdgpu/xe with some ivpu bits and a random few fixes, and dropping the ttm_backup struct which wrapped struct file and was recently frowned at. drm: - Fix overflow when generating wedged event ttm: - Fix documentation - Remove struct ttm_backup panel: - simple: Fix timings for AUO G101EVN010 amdgpu: - DC FP fixes - Freesync fix - DMUB AUX fixes - VCN fix - Hibernation fixes - HDP fixes xe: - Prevent PF queue overflow - Hold all forcewake during mocs test - Remove GSC flush on reset path - Fix forcewake put on error path - Fix runtime warning when building without svm i915: - Fix oops on resume after disconnecting DP MST sinks during suspend - Fix SPLC num_waiters refcounting ivpu: - Increase timeouts - Fix deadlock in cmdq ioctl - Unlock mutices in correct order v3d: - Avoid memory leak in job handling" * tag 'drm-fixes-2025-05-10' of https://gitlab.freedesktop.org/drm/kernel: (32 commits) drm/i915/dp: Fix determining SST/MST mode during MTP TU state computation drm/xe: Add config control for svm flush work drm/xe: Release force wake first then runtime power drm/xe/gsc: do not flush the GSC worker from the reset path drm/xe/tests/mocs: Hold XE_FORCEWAKE_ALL for LNCF regs drm/xe: Add page queue multiplier drm/amdgpu/hdp7: use memcfg register to post the write for HDP flush drm/amdgpu/hdp6: use memcfg register to post the write for HDP flush drm/amdgpu/hdp5.2: use memcfg register to post the write for HDP flush drm/amdgpu/hdp5: use memcfg register to post the write for HDP flush drm/amdgpu/hdp4: use memcfg register to post the write for HDP flush drm/amdgpu: fix pm notifier handling Revert "drm/amd: Stop evicting resources on APUs in suspend" drm/amdgpu/vcn: using separate VCN1_AON_SOC offset drm/amd/display: Fix wrong handling for AUX_DEFER case drm/amd/display: Copy AUX read reply data whenever length > 0 drm/amd/display: Remove incorrect checking in dmub aux handler drm/amd/display: Fix the checking condition in dmub aux handling drm/amd/display: Shift DMUB AUX reply command if necessary drm/amd/display: Call FP Protect Before Mode Programming/Mode Support ...
2025-05-10Merge tag 'drm-intel-fixes-2025-05-09' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes drm/i915 fixes for v6.15-rc6: - Fix oops on resume after disconnecting DP MST sinks during suspend - Fix SPLC num_waiters refcounting Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/87tt5umeaw.fsf@intel.com
2025-05-10Merge tag 'drm-xe-fixes-2025-05-09' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Prevent PF queue overflow - Hold all forcewake during mocs test - Remove GSC flush on reset path - Fix forcewake put on error path - Fix runtime warning when building without svm Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/jffqa56f2zp4i5ztz677cdspgxhnw7qfop3dd3l2epykfpfvza@q2nw6wapsphz
2025-05-09Merge tag 'block-6.15-20250509' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: - Fix for a regression in this series for loop and read/write iterator handling - zone append block update tweak - remove a broken IO priority test - NVMe pull request via Christoph: - unblock ctrl state transition for firmware update (Daniel Wagner) * tag 'block-6.15-20250509' of git://git.kernel.dk/linux: block: remove test of incorrect io priority level nvme: unblock ctrl state transition for firmware update block: only update request sector if needed loop: Add sanity check for read/write_iter
2025-05-09drm/i915/dp: Fix determining SST/MST mode during MTP TU state computationImre Deak
Determining the SST/MST mode during state computation must be done based on the output type stored in the CRTC state, which in turn is set once based on the modeset connector's SST vs. MST type and will not change as long as the connector is using the CRTC. OTOH the MST mode indicated by the given connector's intel_dp::is_mst flag can change independently of the above output type, based on what sink is at any moment plugged to the connector. Fix the state computation accordingly. Cc: Jani Nikula <jani.nikula@intel.com> Fixes: f6971d7427c2 ("drm/i915/mst: adapt intel_dp_mtp_tu_compute_config() for 128b/132b SST") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4607 Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://lore.kernel.org/r/20250507151953.251846-1-imre.deak@intel.com (cherry picked from commit 0f45696ddb2b901fbf15cb8d2e89767be481d59f) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-05-09Merge tag 'amd-drm-fixes-6.15-2025-05-08' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.15-2025-05-08: amdgpu: - DC FP fixes - Freesync fix - DMUB AUX fixes - VCN fix - Hibernation fixes - HDP fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250508194102.3242372-1-alexander.deucher@amd.com
2025-05-09Merge tag 'drm-misc-fixes-2025-05-08' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: drm: - Fix overflow when generating wedged event ivpu: - Increate timeouts - Fix deadlock in cmdq ioctl - Unlock mutices in correct order panel: - simple: Fix timings for AUO G101EVN010 ttm: - Fix documentation - Remove struct ttm_backup v3d: - Avoid memory leak in job handling Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250508104939.GA76697@2a02-2454-fd5e-fd00-c110-cbf2-6528-c5be.dyn6.pyur.net
2025-05-08drm/xe: Add config control for svm flush workShuicheng Lin
Without CONFIG_DRM_XE_GPUSVM set, GPU SVM is not initialized thus below warning pops. Refine the flush work code to be controlled by the config to avoid below warning: " [ 453.132028] ------------[ cut here ]------------ [ 453.132527] WARNING: CPU: 9 PID: 4491 at kernel/workqueue.c:4205 __flush_work+0x379/0x3a0 [ 453.133355] Modules linked in: xe drm_ttm_helper ttm gpu_sched drm_buddy drm_suballoc_helper drm_gpuvm drm_exec [ 453.134352] CPU: 9 UID: 0 PID: 4491 Comm: xe_exec_mix_mod Tainted: G U W 6.15.0-rc3+ #7 PREEMPT(full) [ 453.135405] Tainted: [U]=USER, [W]=WARN ... [ 453.136921] RIP: 0010:__flush_work+0x379/0x3a0 [ 453.137417] Code: 8b 45 00 48 8b 55 08 89 c7 48 c1 e8 04 83 e7 08 83 e0 0f 83 cf 02 89 c6 48 0f ba 6d 00 03 e9 d5 fe ff ff 0f 0b e9 db fd ff ff <0f> 0b 45 31 e4 e9 d1 fd ff ff 0f 0b e9 03 ff ff ff 0f 0b e9 d6 fe [ 453.139250] RSP: 0018:ffffc90000c67b18 EFLAGS: 00010246 [ 453.139782] RAX: 0000000000000000 RBX: ffff888108a24000 RCX: 0000000000002000 [ 453.140521] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8881016d61c8 [ 453.141253] RBP: ffff8881016d61c8 R08: 0000000000000000 R09: 0000000000000000 [ 453.141985] R10: 0000000000000000 R11: 0000000008a24000 R12: 0000000000000001 [ 453.142709] R13: 0000000000000002 R14: 0000000000000000 R15: ffff888107db8c00 [ 453.143450] FS: 00007f44853d4c80(0000) GS:ffff8882f469b000(0000) knlGS:0000000000000000 [ 453.144276] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 453.144853] CR2: 00007f4487629228 CR3: 00000001016aa000 CR4: 00000000000406f0 [ 453.145594] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 453.146320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 453.147061] Call Trace: [ 453.147336] <TASK> [ 453.147579] ? tick_nohz_tick_stopped+0xd/0x30 [ 453.148067] ? xas_load+0x9/0xb0 [ 453.148435] ? xa_load+0x6f/0xb0 [ 453.148781] __xe_vm_bind_ioctl+0xbd5/0x1500 [xe] [ 453.149338] ? dev_printk_emit+0x48/0x70 [ 453.149762] ? _dev_printk+0x57/0x80 [ 453.150148] ? drm_ioctl+0x17c/0x440 [ 453.150544] ? __drm_dev_vprintk+0x36/0x90 [ 453.150983] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 453.151575] ? drm_ioctl_kernel+0x9f/0xf0 [ 453.151998] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 453.152560] drm_ioctl_kernel+0x9f/0xf0 [ 453.152968] drm_ioctl+0x20f/0x440 [ 453.153332] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe] [ 453.153893] ? ioctl_has_perm.constprop.0.isra.0+0xae/0x100 [ 453.154489] ? memory_bm_test_bit+0x5/0x60 [ 453.154935] xe_drm_ioctl+0x47/0x70 [xe] [ 453.155419] __x64_sys_ioctl+0x8d/0xc0 [ 453.155824] do_syscall_64+0x47/0x110 [ 453.156228] entry_SYSCALL_64_after_hwframe+0x76/0x7e " v2 (Matt): refine commit message to have more details add Fixes tag move the code to xe_svm.h which already have the config remove a blank line per codestyle suggestion Fixes: 63f6e480d115 ("drm/xe: Add SVM garbage collector") Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250502170052.1787973-1-shuicheng.lin@intel.com (cherry picked from commit 9d80698bcd97a5ad1088bcbb055e73fd068895e2) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-08drm/xe: Release force wake first then runtime powerShuicheng Lin
xe_force_wake_get() is dependent on xe_pm_runtime_get(), so for the release path, xe_force_wake_put() should be called first then xe_pm_runtime_put(). Combine the error path and normal path together with goto. Fixes: 85d547608ef5 ("drm/xe/xe_gt_debugfs: Update handling of xe_force_wake_get return") Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250507022302.2187527-1-shuicheng.lin@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 432cd94efdca06296cc5e76d673546f58aa90ee1) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-08drm/xe/gsc: do not flush the GSC worker from the reset pathDaniele Ceraolo Spurio
The workqueue used for the reset worker is marked as WQ_MEM_RECLAIM, while the GSC one isn't (and can't be as we need to do memory allocations in the gsc worker). Therefore, we can't flush the latter from the former. The reason why we had such a flush was to avoid interrupting either the GSC FW load or in progress GSC proxy operations. GSC proxy operations fall into 2 categories: 1) GSC proxy init: this only happens once immediately after GSC FW load and does not support being interrupted. The only way to recover from an interruption of the proxy init is to do an FLR and re-load the GSC. 2) GSC proxy request: this can happen in response to a request that the driver sends to the GSC. If this is interrupted, the GSC FW will timeout and the driver request will be failed, but overall the GSC will keep working fine. Flushing the work allowed us to avoid interruption in both cases (unless the hang came from the GSC engine itself, in which case we're toast anyway). However, a failure on a proxy request is tolerable if we're in a scenario where we're triggering a GT reset (i.e., something is already gone pretty wrong), so what we really need to avoid is interrupting the init flow, which we can do by polling on the register that reports when the proxy init is complete (as that ensure us that all the load and init operations have been completed). Note that during suspend we still want to do a flush of the worker to make sure it completes any operations involving the HW before the power is cut. v2: fix spelling in commit msg, rename waiter function (Julia) Fixes: dd0e89e5edc2 ("drm/xe/gsc: GSC FW load") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4830 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://lore.kernel.org/r/20250502155104.2201469-1-daniele.ceraolospurio@intel.com (cherry picked from commit 12370bfcc4f0bdf70279ec5b570eb298963422b5) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-08drm/xe/tests/mocs: Hold XE_FORCEWAKE_ALL for LNCF regsTejas Upadhyay
LNCF registers report wrong values when XE_FORCEWAKE_GT only is held. Holding XE_FORCEWAKE_ALL ensures correct operations on LNCF regs. V2(Himal): - Use xe_force_wake_ref_has_domain Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1999 Fixes: a6a4ea6d7d37 ("drm/xe: Add mocs kunit") Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250428082357.1730068-1-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> (cherry picked from commit 70a2585e582058e94fe4381a337be42dec800337) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-08drm/xe: Add page queue multiplierMatthew Brost
For an unknown reason the math to determine the PF queue size does is not correct - compute UMD applications are overflowing the PF queue which is fatal. A multippier of 8 fixes the problem. Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size") Cc: stable@vger.kernel.org Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jagmeet Randhawa <jagmeet.randhawa@intel.com> Link: https://lore.kernel.org/r/20250408155915.78770-1-matthew.brost@intel.com (cherry picked from commit 29582e0ea75c95668d168b12406e3c56cf5a73c4) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-05-08Merge tag 'vfio-v6.15-rc6' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull vfio fix from Alex Williamson: - Fix an issue in vfio-pci huge_fault handling by aligning faults to the order, resulting in deterministic use of huge pages. This avoids a race where simultaneous aligned and unaligned faults to the same PMD can result in a VM_FAULT_OOM and subsequent VM crash. (Alex Williamson) * tag 'vfio-v6.15-rc6' of https://github.com/awilliam/linux-vfio: vfio/pci: Align huge faults to order
2025-05-08drm/amdgpu/hdp7: use memcfg register to post the write for HDP flushAlex Deucher
Reading back the remapped HDP flush register seems to cause problems on some platforms. All we need is a read, so read back the memcfg register. Fixes: 689275140cb8 ("drm/amdgpu/hdp7.0: do a posting read when flushing HDP") Reported-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908 Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit dbc064adfcf9095e7d895bea87b2f75c1ab23236) Cc: stable@vger.kernel.org
2025-05-08drm/amdgpu/hdp6: use memcfg register to post the write for HDP flushAlex Deucher
Reading back the remapped HDP flush register seems to cause problems on some platforms. All we need is a read, so read back the memcfg register. Fixes: abe1cbaec6cf ("drm/amdgpu/hdp6.0: do a posting read when flushing HDP") Reported-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908 Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 84141ff615951359c9a99696fd79a36c465ed847) Cc: stable@vger.kernel.org
2025-05-08drm/amdgpu/hdp5.2: use memcfg register to post the write for HDP flushAlex Deucher
Reading back the remapped HDP flush register seems to cause problems on some platforms. All we need is a read, so read back the memcfg register. Fixes: f756dbac1ce1 ("drm/amdgpu/hdp5.2: do a posting read when flushing HDP") Reported-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908 Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4a89b7698e771914b4d5b571600c76e2fdcbe2a9) Cc: stable@vger.kernel.org
2025-05-08drm/amdgpu/hdp5: use memcfg register to post the write for HDP flushAlex Deucher
Reading back the remapped HDP flush register seems to cause problems on some platforms. All we need is a read, so read back the memcfg register. Fixes: cf424020e040 ("drm/amdgpu/hdp5.0: do a posting read when flushing HDP") Reported-by: Alexey Klimov <alexey.klimov@linaro.org> Link: https://lists.freedesktop.org/archives/amd-gfx/2025-April/123150.html Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4119 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3908 Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a5cb344033c7598762e89255e8ff52827abb57a4) Cc: stable@vger.kernel.org
2025-05-08Merge tag 'net-6.15-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from CAN, WiFi and netfilter. We have still a comple of regressions open due to the recent drivers locking refactor. The patches are in-flight, but not ready yet. Current release - regressions: - core: lock netdevices during dev_shutdown - sch_htb: make htb_deactivate() idempotent - eth: virtio-net: don't re-enable refill work too early Current release - new code bugs: - eth: icssg-prueth: fix kernel panic during concurrent Tx queue access Previous releases - regressions: - gre: fix again IPv6 link-local address generation. - eth: b53: fix learning on VLAN unaware bridges Previous releases - always broken: - wifi: fix out-of-bounds access during multi-link element defragmentation - can: - initialize spin lock on device probe - fix order of unregistration calls - openvswitch: fix unsafe attribute parsing in output_userspace() - eth: - virtio-net: fix total qstat values - mtk_eth_soc: reset all TX queues on DMA free - fbnic: firmware IPC mailbox fixes" * tag 'net-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (55 commits) virtio-net: fix total qstat values net: export a helper for adding up queue stats fbnic: Do not allow mailbox to toggle to ready outside fbnic_mbx_poll_tx_ready fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context fbnic: Improve responsiveness of fbnic_mbx_poll_tx_ready fbnic: Cleanup handling of completions fbnic: Actually flush_tx instead of stalling out fbnic: Add additional handling of IRQs fbnic: Gate AXI read/write enabling on FW mailbox fbnic: Fix initialization of mailbox descriptor rings net: dsa: b53: do not set learning and unicast/multicast on up net: dsa: b53: fix learning on VLAN unaware bridges net: dsa: b53: fix toggling vlan_filtering net: dsa: b53: do not program vlans when vlan filtering is off net: dsa: b53: do not allow to configure VLAN 0 net: dsa: b53: always rejoin default untagged VLAN on bridge leave net: dsa: b53: fix VLAN ID for untagged vlan on bridge leave net: dsa: b53: fix flushing old pvid VLAN on pvid change net: dsa: b53: fix clearing PVID of a port net: dsa: b53: keep CPU port always tagged again ...
2025-05-08Merge tag 's390-6.15-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Heiko Carstens: - Fix potential use-after-free bug and missing error handling in PCI code - Fix dcssblk build error - Fix last breaking event handling in case of stack corruption to allow for better error reporting - Update defconfigs * tag 's390-6.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/pci: Fix duplicate pci_dev_put() in disable_slot() when PF has child VFs s390/pci: Fix missing check for zpci_create_device() error return s390: Update defconfigs s390/dcssblk: Fix build error with CONFIG_DAX=m and CONFIG_DCSSBLK=y s390/entry: Fix last breaking event handling in case of stack corruption s390/configs: Enable options required for TC flow offload s390/configs: Enable VDPA on Nvidia ConnectX-6 network card
2025-05-08virtio-net: fix total qstat valuesJakub Kicinski
NIPA tests report that the interface statistics reported via qstat are lower than those reported via ip link. Looks like this is because some tests flip the queue count up and down, and we end up with some of the traffic accounted on disabled queues. Add up counters from disabled queues. Fixes: d888f04c09bb ("virtio-net: support queue stat") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250507003221.823267-3-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08fbnic: Do not allow mailbox to toggle to ready outside fbnic_mbx_poll_tx_readyAlexander Duyck
We had originally thought to have the mailbox go to ready in the background while we were doing other things. One issue with this though is that we can't disable it by clearing the ready state without also blocking interrupts or calls to mbx_poll as it will just pop back to life during an interrupt. In order to prevent that from happening we can pull the code for toggling to ready out of the interrupt path and instead place it in the fbnic_mbx_poll_tx_ready path so that it becomes the only spot where the Rx/Tx can toggle to the ready state. By doing this we can prevent races where we disable the DMA and/or free buffers only to have an interrupt fire and undo what we have done. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654722518.499179.11612865740376848478.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt contextAlexander Duyck
This change pulls the call to fbnic_fw_xmit_cap_msg out of fbnic_mbx_init_desc_ring and instead places it in the polling function for getting the Tx ready. Doing that we can avoid the potential issue with an interrupt coming in later from the firmware that causes it to get fired in interrupt context. Fixes: 20d2e88cc746 ("eth: fbnic: Add initial messaging to notify FW of our presence") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654721876.499179.9839651602256668493.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08fbnic: Improve responsiveness of fbnic_mbx_poll_tx_readyAlexander Duyck
There were a couple different issues found in fbnic_mbx_poll_tx_ready. Among them were the fact that we were sleeping much longer than we actually needed to as the actual FW could respond in under 20ms. The other issue was that we would just keep polling the mailbox even if the device itself had gone away. To address the responsiveness issues we can decrease the sleeps to 20ms and use a jiffies based timeout value rather than just counting the number of times we slept and then polled. To address the hardware going away we can move the check for the firmware BAR being present from where it was and place it inside the loop after the mailbox descriptor ring is initialized and before we sleep so that we just abort and return an error if the device went away during initialization. With these two changes we see a significant improvement in boot times for the driver. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654721224.499179.2698616208976624755.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08fbnic: Cleanup handling of completionsAlexander Duyck
There was an issue in that if we were to shutdown we could be left with a completion in flight as the mailbox went away. To address that I have added an fbnic_mbx_evict_all_cmpl function that is meant to essentially create a "broken pipe" type response so that all callers will receive an error indicating that the connection has been broken as a result of us shutting down the mailbox. Fixes: 378e5cc1c6c6 ("eth: fbnic: hwmon: Add completion infrastructure for firmware requests") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654720578.499179.380252598204530873.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08fbnic: Actually flush_tx instead of stalling outAlexander Duyck
The fbnic_mbx_flush_tx function had a number of issues. First, we were waiting 200ms for the firmware to process the packets. We can drop this to 20ms and in almost all cases this should be more than enough time. So by changing this we can significantly reduce shutdown time. Second, we were not making sure that the Tx path was actually shut off. As such we could still have packets added while we were flushing the mailbox. To prevent that we can now clear the ready flag for the Tx side and it should stay down since the interrupt is disabled. Third, we kept re-reading the tail due to the second issue. The tail should not move after we have started the flush so we can just read it once while we are holding the mailbox Tx lock. By doing that we are guaranteed that the value should be consistent. Fourth, we were keeping a count of descriptors cleaned due to the second and third issues called out. That count is not a valid reason to be exiting the cleanup, and with the tail only being read once we shouldn't see any cases where the tail moves after the disable so the tracking of count can be dropped. Fifth, we were using attempts * sleep time to determine how long we would wait in our polling loop to flush out the Tx. This can be very imprecise. In order to tighten up the timing we are shifting over to using a jiffies value of jiffies + 10 * HZ + 1 to determine the jiffies value we should stop polling at as this should be accurate within once sleep cycle for the total amount of time spent polling. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654719929.499179.16406653096197423749.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>