summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-01-14i2c: amd756: Remove superfluous TODOAtharva Tiwari
This old driver has never been used on big-endian systems, so remove the todo. Suggested-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Atharva Tiwari <evepolonium@gmail.com> [wsa: reworded commit message] Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2025-01-14Revert "i2c: amd756: Fix endianness handling for word data"Wolfram Sang
This reverts commit 70f3d3669c074efbcee32867a1ab71f5f7ead385. We concluded that removing the comments is the right thing to do. This will be done by an incremental patch. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2025-01-14Merge branch 'i2c/i2c-host' of ↵Wolfram Sang
git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-mergewindow Andi is unavailable for some time. So, I take over his work for this mergewindow.
2025-01-14udp: Make rehash4 independent in udp_lib_rehash()Philo Lu
As discussed in [0], rehash4 could be missed in udp_lib_rehash() when udp hash4 changes while hash2 doesn't change. This patch fixes this by moving rehash4 codes out of rehash2 checking, and then rehash2 and rehash4 are done separately. By doing this, we no longer need to call rehash4 explicitly in udp_lib_hash4(), as the rehash callback in __ip4_datagram_connect takes it. Thus, now udp_lib_hash4() returns directly if the sk is already hashed. Note that uhash4 may fail to work under consecutive connect(<dst address>) calls because rehash() is not called with every connect(). To overcome this, connect(<AF_UNSPEC>) needs to be called after the next connect to a new destination. [0] https://lore.kernel.org/all/4761e466ab9f7542c68cdc95f248987d127044d2.1733499715.git.pabeni@redhat.com/ Fixes: 78c91ae2c6de ("ipv4/udp: Add 4-tuple hash for connected socket") Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Philo Lu <lulie@linux.alibaba.com> Link: https://patch.msgid.link/20250110010810.107145-1-lulie@linux.alibaba.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-14USB: serial: quatech2: fix null-ptr-deref in qt2_process_read_urb()Qasim Ijaz
This patch addresses a null-ptr-deref in qt2_process_read_urb() due to an incorrect bounds check in the following: if (newport > serial->num_ports) { dev_err(&port->dev, "%s - port change to invalid port: %i\n", __func__, newport); break; } The condition doesn't account for the valid range of the serial->port buffer, which is from 0 to serial->num_ports - 1. When newport is equal to serial->num_ports, the assignment of "port" in the following code is out-of-bounds and NULL: serial_priv->current_port = newport; port = serial->port[serial_priv->current_port]; The fix checks if newport is greater than or equal to serial->num_ports indicating it is out-of-bounds. Reported-by: syzbot <syzbot+506479ebf12fe435d01a@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=506479ebf12fe435d01a Fixes: f7a33e608d9a ("USB: serial: add quatech2 usb to serial driver") Cc: <stable@vger.kernel.org> # 3.5 Signed-off-by: Qasim Ijaz <qasdev00@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Johan Hovold <johan@kernel.org>
2025-01-14net: sched: calls synchronize_net() only when neededEric Dumazet
dev_deactivate_many() role is to remove the qdiscs of a network device. When/if a qdisc is dismantled, an rcu grace period is needed to make sure all outstanding qdisc enqueue are done before we proceed with a qdisc reset. Most virtual devices do not have a qdisc. We can call the expensive synchronize_net() only if needed. Note that dev_deactivate_many() does not have to deal with qdisc-less dev_queue_xmit, as an old comment was claiming. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250109171850.2871194-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-14RDMA/bnxt_re: Add support to handle DCB_CONFIG_CHANGE eventKalesh AP
QP1 context in HW needs to be updated when there is a change in the default DSCP values used for RoCE traffic. Handle the event from FW and modify the dscp value used by QP1. Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/20250107024553.2926983-5-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-01-14RDMA/bnxt_re: Query firmware defaults of CC params during probeKalesh AP
Added function to query firmware default values of CC parameters during driver init. These values will be stored in driver local structure and used in subsequent patch. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20250107024553.2926983-4-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-01-14RDMA/bnxt_re: Add Async event handling supportKalesh AP
Using the option provided by Ethernet driver, register for FW Async event. During probe, while registeriung with Ethernet driver, provide the ulp hook 'ulp_async_notifier' for receiving the firmware events. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20250107024553.2926983-3-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-01-14ALSA: hda: Add AZX_DCAPS_NO_TCSEL flag for Loongson HDA devicesQunqin Zhao
Loongson's HDA devices do not support TCSEL functionality. Signed-off-by: Qunqin Zhao <zhaoqunqin@loongson.cn> Link: https://patch.msgid.link/20250114080700.23029-1-zhaoqunqin@loongson.cn Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-01-14bnxt_en: Add ULP call to notify async eventsMichael Chan
When the driver receives an async event notification from the Firmware, we make the new ulp_async_notifier() call to inform the RDMA driver that a firmware async event has been received. RDMA driver can then take necessary actions based on the event type. In the next patch, we will implement the ulp_async_notifier() callbacks in the RDMA driver. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/20250107024553.2926983-2-kalesh-anakkur.purayil@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-01-14HID: intel-thc-hid: fix build errors in um modeEven Xu
Add dependency to ACPI to avoid acpi APIs missing in um mode. Signed-off-by: Even Xu <even.xu@intel.com> Suggested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501131826.sX2DubPG-lkp@intel.com Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-01-14pinctrl: renesas: rzg2l: Fix PFC_MASK for RZ/V2H and RZ/G3ELad Prabhakar
The PFC_MASK value for the PFC_mx registers is currently hardcoded to 0x07, which is correct for SoCs in the RZ/G2L family, but insufficient for RZ/V2H and RZ/G3E, where the mask value should be 0x0f. This discrepancy causes incorrect PFC register configuration on RZ/V2H and RZ/G3E SoCs. On RZ/G2L, the PFC_mx bitfields are also 4 bits wide, with bit 4 marked as reserved. The reserved bits are documented to read as zero and be ignored when written. Updating the PFC_MASK definition from 0x07 to 0x0f ensures compatibility with both SoC families while maintaining correct behavior on RZ/G2L. Fixes: 9bd95ac86e70 ("pinctrl: renesas: rzg2l: Add support for RZ/V2H SoC") Cc: stable@vger.kernel.org Reported-by: Hien Huynh <hien.huynh.px@renesas.com> Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/20250110221045.594596-1-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
2025-01-14drm/i915/fb: Relax clear color alignment to 64 bytesVille Syrjälä
Mesa changed its clear color alignment from 4k to 64 bytes without informing the kernel side about the change. This is now likely to cause framebuffer creation to fail. The only thing we do with the clear color buffer in i915 is: 1. map a single page 2. read out bytes 16-23 from said page 3. unmap the page So the only requirement we really have is that those 8 bytes are all contained within one page. Thus we can deal with the Mesa regression by reducing the alignment requiment from 4k to the same 64 bytes in the kernel. We could even go as low as 32 bytes, but IIRC 64 bytes is the hardware requirement on the 3D engine side so matching that seems sensible. Note that the Mesa alignment chages were partially undone so the regression itself was already fixed on userspace side. Cc: stable@vger.kernel.org Cc: Sagar Ghuge <sagar.ghuge@intel.com> Cc: Nanley Chery <nanley.g.chery@intel.com> Reported-by: Xi Ruoyao <xry111@xry111.site> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13057 Closes: https://lore.kernel.org/all/45a5bba8de009347262d86a4acb27169d9ae0d9f.camel@xry111.site/ Link: https://gitlab.freedesktop.org/mesa/mesa/-/commit/17f97a69c13832a6c1b0b3aad45b06f07d4b852f Link: https://gitlab.freedesktop.org/mesa/mesa/-/commit/888f63cf1baf34bc95e847a30a041dc7798edddb Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241129065014.8363-2-ville.syrjala@linux.intel.com Tested-by: Xi Ruoyao <xry111@xry111.site> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> (cherry picked from commit ed3a892e5e3d6b3f6eeb76db7c92a968aeb52f3d) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2025-01-14efi: sysfb_efi: fix W=1 warnings when EFI is not setRandy Dunlap
A build with W=1 fails because there are code and data that are not needed or used when CONFIG_EFI is not set. Move the "#ifdef CONFIG_EFI" block to earlier in the source file so that the unused code/data are not built. drivers/firmware/efi/sysfb_efi.c:345:39: warning: ‘efifb_fwnode_ops’ defined but not used [-Wunused-const-variable=] 345 | static const struct fwnode_operations efifb_fwnode_ops = { | ^~~~~~~~~~~~~~~~ drivers/firmware/efi/sysfb_efi.c:238:35: warning: ‘efifb_dmi_swap_width_height’ defined but not used [-Wunused-const-variable=] 238 | static const struct dmi_system_id efifb_dmi_swap_width_height[] __initconst = { | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/firmware/efi/sysfb_efi.c:188:35: warning: ‘efifb_dmi_system_table’ defined but not used [-Wunused-const-variable=] 188 | static const struct dmi_system_id efifb_dmi_system_table[] __initconst = { | ^~~~~~~~~~~~~~~~~~~~~~ Fixes: 15d27b15de96 ("efi: sysfb_efi: fix build when EFI is not set") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501071933.20nlmJJt-lkp@intel.com/ Cc: David Rheinsberg <david@readahead.eu> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: Peter Jones <pjones@redhat.com> Cc: Simona Vetter <simona@ffwll.ch> Cc: linux-fbdev@vger.kernel.org Cc: Ard Biesheuvel <ardb@kernel.org> Cc: linux-efi@vger.kernel.org Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-14efi/libstub: Use __free() helper for pool deallocationsArd Biesheuvel
Annotate some local buffer allocations as __free(efi_pool) and simplify the associated error handling accordingly. This removes a couple of gotos and simplifies the code. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-14efi/libstub: Use cleanup helpers for freeing copies of the memory mapArd Biesheuvel
The EFI stub may obtain the memory map from the firmware numerous times, and this involves doing a EFI pool allocation first, which needs to be freed after use. Streamline this using a cleanup helper, which makes the code easier to follow. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-14efi/libstub: Simplify PCI I/O handle buffer traversalArd Biesheuvel
Use LocateHandleBuffer() and a __free() cleanup helper to simplify the PCI I/O handle buffer traversal code. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-14efi/libstub: Refactor and clean up GOP resolution picker codeArd Biesheuvel
The EFI stub implements various ways of setting the resolution of the EFI framebuffer at boot, and this duplicates a lot of boilerplate for iterating over the supported modes and extracting the resolution and color depth. Refactor this into a single helper that takes a callback, and use it for the 'auto', 'list' and 'res' selection methods. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-14efi/libstub: Simplify GOP handling codeArd Biesheuvel
Use the LocateHandleBuffer() API and a __free() function to simplify the logic that allocates a handle buffer to iterate over all GOP protocols in the EFI database. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-14efi/libstub: Use C99-style for loop to traverse handle bufferArd Biesheuvel
Tweak the for_each_efi_handle() macro in order to avoid the need on the part of the caller to provide a loop counter variable. Also move efi_get_handle_num() to the callers, so that each occurrence can be replaced with the actual number returned by the simplified LocateHandleBuffer API. Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-14x86/efistub: Drop long obsolete UGA supportArd Biesheuvel
UGA is the EFI graphical output protocol that preceded GOP, and has been long obsolete. Drop support for it from the x86 implementation of the EFI stub - other architectures never bothered to implement it (save for ia64) Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-01-13mm/early_ioremap: add null pointer checks to prevent NULL-pointer dereferenceGuo Weikang
The early_ioremap interface can fail and return NULL in certain cases. To prevent NULL-pointer dereference crashes, fixed issues in the acpi_extlog and copy_early_mem interfaces, improving robustness when handling early memory. Link: https://lkml.kernel.org/r/20241212101004.1544070-1-guoweikang.kernel@gmail.com Signed-off-by: Guo Weikang <guoweikang.kernel@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Baoquan He <bhe@redhat.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Julian Stecklina <julian.stecklina@cyberus-technology.de> Cc: Kevin Loughlin <kevinloughlin@google.com> Cc: Len Brown <lenb@kernel.org> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13mm: add comments to do_mmap(), mmap_region() and vm_mmap()Lorenzo Stoakes
It isn't always entirely clear to users the difference between do_mmap(), mmap_region() and vm_mmap(), so add comments to clarify what's going on in each. This is compounded by the fact that we actually allow callers external to mm to invoke both do_mmap() and mmap_region() (!), the latter of which is really strictly speaking an internal memory mapping implementation detail. Link: https://lkml.kernel.org/r/20241212113152.28849-1-lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Jann Horn <jannh@google.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13mm: assert mmap write lock held on do_mmap(), mmap_region()Lorenzo Stoakes
Both of these functions can be invoked outside of mm, so it is probably a good idea to assert that the required lock is held. Will only have an impact if CONFIG_DEBUG_VM is set, otherwise this amounts to no change at all. Link: https://lkml.kernel.org/r/20241212114841.55185-1-lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13MAINTAINERS: update MEMORY MAPPING sectionLorenzo Stoakes
Update the MEMORY MAPPING section to contain VMA logic as it makes no sense to have these two sections separate. Additionally, add files which permit changes to the attributes and/or ranges spanned by memory mappings, in essence anything which might alter the output of /proc/$pid/[s]maps. This is necessarily fuzzy, as there is not quite as good separation of concerns as we would ideally like in the kernel. However each of these files interacts with the VMA and memory mapping logic in such a way as to be inseparatable from it, and it is important that they are maintained in conjunction with it. Link: https://lkml.kernel.org/r/20241211105315.21756-1-lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13memcg/hugetlb: remove memcg hugetlb try-commit-cancel protocolJoshua Hahn
This patch fully removes the mem_cgroup_{try, commit, cancel}_charge functions, as well as their hugetlb variants. Link: https://lkml.kernel.org/r/20241211203951.764733-4-joshua.hahnjy@gmail.com Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Nhat Pham <nphamcs@gmail.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13memcg/hugetlb: introduce mem_cgroup_charge_hugetlbJoshua Hahn
This patch introduces mem_cgroup_charge_hugetlb which combines the logic of mem_cgroup_hugetlb_try_charge / mem_cgroup_hugetlb_commit_charge and removes the need for mem_cgroup_hugetlb_cancel_charge. It also reduces the footprint of memcg in hugetlb code and consolidates all memcg related error paths into one. Link: https://lkml.kernel.org/r/20241211203951.764733-3-joshua.hahnjy@gmail.com Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Nhat Pham <nphamcs@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13memcg/hugetlb: introduce memcg_accounts_hugetlbJoshua Hahn
Patch series "memcg/hugetlb: Rework memcg hugetlb charging", v3. This series cleans up memcg's hugetlb charging logic by deprecating the current memcg hugetlb try-charge + {commit, cancel} logic present in alloc_hugetlb_folio. A single function mem_cgroup_charge_hugetlb takes its place instead. This makes the code more maintainable by simplifying the error path and reduces memcg's footprint in hugetlb logic. This patch introduces a few changes in the hugetlb folio allocation error path: (a) Instead of having multiple return points, we consolidate them to two: one for reaching the memcg limit or running out of memory (-ENOMEM) and one for hugetlb allocation fails / limit being reached (-ENOSPC). (b) Previously, the memcg limit was checked before the folio is acquired, meaning the hugeTLB folio isn't acquired if the limit is reached. This patch performs the charging after the folio is reached, meaning if memcg's limit is reached, the acquired folio is freed right away. This patch builds on two earlier patch series: [2] which adds memcg hugeTLB counters, and [3] which deprecates charge moving and removes the last references to mem_cgroup_cancel_charge. The request for this cleanup can be found in [2]. [1] https://lore.kernel.org/all/20231006184629.155543-1-nphamcs@gmail.com/ [2] https://lore.kernel.org/all/20241101204402.1885383-1-joshua.hahnjy@gmail.com/ [3] https://lore.kernel.org/linux-mm/20241025012304.2473312-1-shakeel.butt@linux.dev/ This patch (of 3): This patch isolates the check for whether memcg accounts hugetlb. This condition can only be true if the memcg mount option memory_hugetlb_accounting is on, which includes hugetlb usage in memory.current. Link: https://lkml.kernel.org/r/20241211203951.764733-1-joshua.hahnjy@gmail.com Link: https://lkml.kernel.org/r/20241211203951.764733-2-joshua.hahnjy@gmail.com Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Nhat Pham <nphamcs@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13mm/migrate: remove slab checks in isolate_movable_page()Hyeonggon Yoo
Commit 8b8817630ae8 ("mm/migrate: make isolate_movable_page() skip slab pages") introduced slab checks to prevent mis-identification of slab pages as movable kernel pages. However, after Matthew's frozen folio series, these slab checks became unnecessary as the migration logic fails to increase the reference count for frozen slab folios. Remove these redundant slab checks and associated memory barriers. Link: https://lkml.kernel.org/r/20241210124807.8584-1-42.hyeyoo@gmail.com Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13samples/damon/prcl: implement schemes setupSeongJae Park
Implement a proactive cold memory regions reclaiming logic of prcl sample module using DAMOS. The logic treats memory regions that not accessed at all for five or more seconds as cold, and reclaim those as soon as found. Link: https://lkml.kernel.org/r/20241210215030.85675-6-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13samples/damon: introduce a skeleton of a smaple DAMON module for proactive ↵SeongJae Park
reclamation DAMON is not only for monitoring of access patterns, but also for access-aware system operations. For the system operations, DAMON provides a feature called DAMOS (Data Access Monitoring-based Operation Schemes). There is no sample API usage of DAMOS, though. Copy the working set size estimation sample modules with changed names of the module and symbols, to use it as a skeleton for a sample module showing the DAMOS API usage. The following commit will make it proactively reclaim cold memory of the given process, using DAMOS. Link: https://lkml.kernel.org/r/20241210215030.85675-5-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13samples/damon/wsse: implement working set size estimation and loggingSeongJae Park
Implement the DAMON-based working set size estimation logic. The logic iterates memory regions in DAMON-generated access pattern snapshot for every aggregation interval and get the total sum of the size of any region having one or higher 'nr_accesses' count. That is, it assumes any region having one or higher 'nr_accesses' to be a part of the working set. The estimated value is reported to the user by printing it to the kernel log. Link: https://lkml.kernel.org/r/20241210215030.85675-4-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13samples/damon/wsse: start and stop DAMON as the user requestsSeongJae Park
Start running DAMON to monitor accesses of a process that the user specified via 'target_pid' parameter, when 'y' is passed to 'enable' parameter. Stop running DAMON when 'n' is passed to 'enable' parameter. Estimating the working set size from DAMON's monitoring results and reporting it to the user will be implemented by the following commit. Link: https://lkml.kernel.org/r/20241210215030.85675-3-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13samples: add a skeleton of a sample DAMON module for working set size estimationSeongJae Park
Patch series "mm/damon: add sample modules". Implement a proactive cold memory regions reclaiming logic of prcl sample module using DAMOS. The logic treats memory regions that not accessed at all for five or more seconds as cold, and reclaim those as soon as found. This patch (of 5): Add a skeleton for a sample DAMON static module that can be used for estimating working set size of a given process. Note that it is a static module since DAMON is not exporting symbols to loadable modules for now. It exposes two module parameters, namely 'pid' and 'enable'. 'pid' will specify the process that the module will estimate the working set size of. 'enable' will receive whether to start or stop the estimation. Because this is just a skeleton, the parameters do nothing, though. The functionalities will be implemented by following commits. Link: https://lkml.kernel.org/r/20241210215030.85675-1-sj@kernel.org Link: https://lkml.kernel.org/r/20241210215030.85675-2-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13selftests/mm: remove X permission from sigaltstack mappingKevin Brodsky
There is no reason why the alternate signal stack should be mapped as RWX. Map it as RW instead. Link: https://lkml.kernel.org/r/20241209095019.1732120-15-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13selftests/mm: skip pkey_sighandler_tests if support is missingKevin Brodsky
The pkey_sighandler_tests are bound to fail if either the kernel or CPU doesn't support pkeys. Skip the tests if pkeys support is missing. Link: https://lkml.kernel.org/r/20241209095019.1732120-14-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13selftests/mm: rename pkey register macroKevin Brodsky
PKEY_ALLOW_ALL is meant to represent the pkey register value that allows all accesses (enables all pkeys). However its current naming suggests that the value applies to *one* key only (like PKEY_DISABLE_ACCESS for instance). Rename PKEY_ALLOW_ALL to PKEY_REG_ALLOW_ALL to avoid such misunderstanding. This is consistent with the PKEY_REG_ALLOW_NONE macro introduced by commit 6e182dc9f268 ("selftests/mm: Use generic pkey register manipulation"). Link: https://lkml.kernel.org/r/20241209095019.1732120-13-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13selftests/mm: use sys_pkey helpers consistentlyKevin Brodsky
sys_pkey_alloc, sys_pkey_free and sys_mprotect_pkey are currently used in protections_keys.c, while pkey_sighandler_tests.c calls the libc wrappers directly (e.g. pkey_mprotect()). This is probably ok when using glibc (those symbols appeared a while ago), but Musl does not currently provide them. The logging in the helpers from pkey-helpers.h can also come in handy. Make things more consistent by using the sys_pkey helpers in pkey_sighandler_tests.c too. To that end their implementation is moved to a common .c file (pkey_util.c). This also enables calling is_pkeys_supported() outside of protections_keys.c, since it relies on sys_pkey_{alloc,free}. [kevin.brodsky@arm.com: fix dependency on pkey_util.c] Link: https://lkml.kernel.org/r/20241216092849.2140850-1-kevin.brodsky@arm.com Link: https://lkml.kernel.org/r/20241209095019.1732120-12-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13selftests/mm: ensure non-global pkey symbols are marked staticKevin Brodsky
The pkey tests define a whole lot of functions and some global variables. A few are truly global (declared in pkey-helpers.h), but the majority are file-scoped. Make sure those are labelled static. Some of the pkey_{access,write}_{allow,deny} helpers are not called, or only called when building for some architectures. Mark them __maybe_unused to suppress compiler warnings. Link: https://lkml.kernel.org/r/20241209095019.1732120-11-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13selftests/mm: remove empty pkey helper definitionKevin Brodsky
Some of the functions declared in pkey-helpers.h are actually defined in protections_keys.c, meaning they can only be called from protections_keys.c. This is less than ideal, but it is hard to avoid as these helpers are themselves called from inline functions in pkey-<arch>.h. Let's at least add a comment clarifying that. We can also remove the empty definition in pkey_sighandler_tests.c: expected_pkey_fault() is not meant to be called from there. Link: https://lkml.kernel.org/r/20241209095019.1732120-10-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13selftests/mm: ensure pkey-*.h define inline functions onlyKevin Brodsky
Headers should not define non-inline functions, as this prevents them from being included more than once in a given program. pkey-helpers.h and the arch-specific headers it includes currently define multiple such non-inline functions. In most cases those functions can simply be made inline - this patch does just that. read_ptr() is an exception as it must not be inlined. Since it is only called from protection_keys.c, we just move it there. Link: https://lkml.kernel.org/r/20241209095019.1732120-9-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13selftests/mm: define types using typedef in pkey-helpers.hKevin Brodsky
Using #define to define types should be avoided. Use typedef instead. Also ensure that __u* types are actually defined by including <linux/types.h>. Link: https://lkml.kernel.org/r/20241209095019.1732120-8-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13selftests/mm: remove unused pkey helpersKevin Brodsky
Commit 5f23f6d082a9 ("x86/pkeys: Add self-tests") introduced a number of helpers and functions that don't seem to have ever been used. Let's remove them. Link: https://lkml.kernel.org/r/20241209095019.1732120-7-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13selftests/mm: build with -O2Kevin Brodsky
The mm kselftests are currently built with no optimisation (-O0). It's unclear why, and besides being obviously suboptimal, this also prevents the pkeys tests from working as intended. Let's build all the tests with -O2. [kevin.brodsky@arm.com: silence unused-result warnings] Link: https://lkml.kernel.org/r/20250107170110.2819685-1-kevin.brodsky@arm.com Link: https://lkml.kernel.org/r/20241209095019.1732120-6-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13selftests/mm: fix -Warray-bounds warnings in pkey_sighandler_testsKevin Brodsky
GCC doesn't like dereferencing a pointer set to 0x1 (when building at -O2): pkey_sighandler_tests.c:166:9: warning: array subscript 0 is outside array bounds of 'int[0]' [-Warray-bounds=] 166 | *(int *) (0x1) = 1; | ^~~~~~~~~~~~~~ cc1: note: source object is likely at address zero Using NULL instead seems to make it happy. This should make no difference in practice (SIGSEGV with SEGV_MAPERR will be the outcome regardless), we just need to update the expected si_addr. [kevin.brodsky@arm.com: fix clang dereferencing-null issue] Link: https://lkml.kernel.org/r/20241218153615.2267571-1-kevin.brodsky@arm.com Link: https://lkml.kernel.org/r/20241209095019.1732120-5-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13selftests/mm: fix strncpy() lengthKevin Brodsky
GCC complains (with -O2) that the length is equal to the destination size, which is indeed invalid. Subtract 1 from the size of the array to leave room for '\0'. Link: https://lkml.kernel.org/r/20241209095019.1732120-4-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13selftests/mm: fix -Wmaybe-uninitialized warningsKevin Brodsky
A few -Wmaybe-uninitialized warnings show up when building the mm tests with -O2. None of them looks worrying; silence them by initialising the problematic variables. Link: https://lkml.kernel.org/r/20241209095019.1732120-3-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13selftests/mm: fix condition in uffd_move_test_common()Kevin Brodsky
Patch series "pkeys kselftests improvements". This series brings various cleanups and fixes for the mm (mostly pkeys) kselftests. The original goal was to make the pkeys tests work out of the box and without build warning - it turned out to be more involved than expected. The most important change is enabling -O2 when building all mm kselftests (patch 5). This is actually needed for the pkeys tests to run successfully (see gcc command line at the top of protection_keys.c and pkey_sighandler_tests.c), and seems to have no negative impact on the other tests. It certainly can't hurt performance! The following patches address a few obvious issues in the pkeys tests (unused code, bad scope for functions/variables, etc.) and finally make a couple of small improvements. There is one ugliness that this series does not fix: some functions in pkey-<arch>.h call functions that are actually defined in protection_keys.c. For instance, expect_fault_on_read_execonly_key() in pkey-x86.h calls expected_pkey_fault(). This means that other test programs that use pkey-helpers.h (namely pkey_sighandler_tests) would fail to link if they called such functions defined in pkey-<arch>.h. Fixing this would require a more comprehensive reorganisation of the pkey-* headers, which doesn't seem worth it (patch 9 adds a comment to pkey-helpers.h to clarify the situation). Some more details on the patches: - Patch 1 is an unrelated fix that was revealed by inspecting a warning. It seems fairly harmless though, so I thought I'd just post it as part of this series. - Patch 2-5 fix various warnings that come up by building the mm tests at -O2 and finally enable -O2. - Patch 6-12 are various cleanups for the pkeys tests. Patch 11 in particular enables is_pkeys_supported() to be called from outside protection_keys.c (patch 13 relies on this). - Patch 13-14 are small improvements to pkey_sighandler_tests.c. Many thanks to Ryan Roberts for checking that the mm tests still run fine on arm64 with those patches applied. I've also checked that the pkeys tests run fine on arm64 and x86. This patch (of 14): area_src and area_dst are saved at the beginning of the function if chunk_size > page_size. The intention is quite clearly to restore them at the end based on the same condition, but step_size is considered instead of chunk_size. Considering that step_size is a number of pages, the condition is likely to be false. Use the same condition as when saving so that the globals are restored as intended. Link: https://lkml.kernel.org/r/20241209095019.1732120-1-kevin.brodsky@arm.com Link: https://lkml.kernel.org/r/20241209095019.1732120-2-kevin.brodsky@arm.com Fixes: a2bf6a9ca805 ("selftests/mm: add UFFDIO_MOVE ioctl test") Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Aruna Ramakrishna <aruna.ramakrishna@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Keith Lucas <keith.lucas@oracle.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-13mm/memory_hotplug: don't use __GFP_HARDWALL when migrating pages via memory ↵David Hildenbrand
offlining We'll migrate pages allocated by other context; respecting the cpuset of the memory offlining context when allocating a migration target does not make sense. Drop the __GFP_HARDWALL by using GFP_KERNEL. Note that in an ideal world, migration code could figure out the cpuset of the original context and take that into consideration. Link: https://lkml.kernel.org/r/20241205090508.2095225-3-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Suggested-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Oscar Salvador <osalvador@suse.de> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>