Age | Commit message (Collapse) | Author |
|
Make sure to drop the reference to the dwc3 device taken by
of_find_device_by_node() on probe errors and on driver unbind.
Fixes: 6dd2565989b4 ("usb: dwc3: add imx8mp dwc3 glue layer driver")
Cc: stable@vger.kernel.org # 5.12
Cc: Li Jun <jun.li@nxp.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20250724091910.21092-2-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Nothing seems to prevent the driver from being compile tested on
non-OMAP platforms so enable it to provide wider build coverage.
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20250724092104.21317-1-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Since commit f3323cd03e58 ("usb: gadget: udc: renesas_usb3: remove R-Car
H3 ES1.* handling") the driver only supports OF probe so drop the unused
platform module alias.
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20250724092006.21216-1-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When encounters some errors like these:
xhci_hcd 0000:4a:00.2: xHCI dying or halted, can't queue_command
xhci_hcd 0000:4a:00.2: FIXME: allocate a command ring segment
usb usb5-port6: couldn't allocate usb_device
It's hard to know whether xhc_state is dying or halted. So it's better
to print xhc_state's value which can help locate the resaon of the bug.
Signed-off-by: Su Hui <suhui@nfschina.com>
Link: https://lore.kernel.org/r/20250725060117.1773770-1-suhui@nfschina.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
In case of ovl_lookup_temp() failure, we currently print `err`
which is actually not initialized at all.
Instead, properly print PTR_ERR(whiteout) which is where the
actual error really is.
Address-Coverity-ID: 1647983 ("Uninitialized variables (UNINIT)")
Fixes: 8afa0a7367138 ("ovl: narrow locking in ovl_whiteout()")
Signed-off-by: Antonio Quartulli <antonio@mandelbit.com>
Link: https://lore.kernel.org/20250721203821.7812-1-antonio@mandelbit.com
Reviewed-by: NeilBrown <neil@brown.name>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Quote from the virtio specification chapter 4.2.2.2:
"For the device-specific configuration space, the driver MUST use 8 bit
wide accesses for 8 bit wide fields, 16 bit wide and aligned accesses
for 16 bit wide fields and 32 bit wide and aligned accesses for 32 and
64 bit wide fields."
Signed-off-by: Harald Mommer <harald.mommer@oss.qualcomm.com>
Cc: stable@vger.kernel.org
Fixes: 3a29355a22c0 ("gpio: Add virtio-gpio driver")
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20250724143718.5442-2-harald.mommer@oss.qualcomm.com
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
If STATX_BASIC_STATS flags are not given as an argument to vfs_getattr,
It can not get ctime and mtime in kstat.
This causes a problem showing mtime and ctime outdated from cifs.ko.
File: /xfstest.test/foo
Size: 4096 Blocks: 8 IO Block: 1048576 regular file
Device: 0,65 Inode: 2033391 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Context: system_u:object_r:cifs_t:s0
Access: 2025-07-23 22:15:30.136051900 +0100
Modify: 1970-01-01 01:00:00.000000000 +0100
Change: 1970-01-01 01:00:00.000000000 +0100
Birth: 2025-07-23 22:15:30.136051900 +0100
Cc: stable@vger.kernel.org
Reported-by: David Howells <dhowells@redhat.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
If client send multiple session setup requests to ksmbd,
Preauh_HashValue race condition could happen.
There is no need to free sess->Preauh_HashValue at session setup phase.
It can be freed together with session at connection termination phase.
Cc: stable@vger.kernel.org
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-27661
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
xa_store() may fail so check its return value and return error code if
error occurred.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
If client send two session setups with krb5 authenticate to ksmbd,
null pointer dereference error in generate_encryptionkey could happen.
sess->Preauth_HashValue is set to NULL if session is valid.
So this patch skip generate encryption key if session is valid.
Cc: stable@vger.kernel.org
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-27654
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
This fixes an infinite loop when repairing "extent past end of inode",
when the extent is an older snapshot than the inode that needs repair.
Without the snaphsots_seen_add_inorder() we keep trying to delete the
same extent, even though it's no longer visible in the inode's snapshot.
Fixes: 63d6e9311999 ("bcachefs: bch2_fpunch_snapshot()")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
When flushing the btree write buffer, we pull write buffer keys directly
from the journal instead of letting the journal write path copy them to
the write buffer.
When flushing from the currently open journal buffer, we have to block
new reservations and wait for outstanding reservations to complete.
Recheck the reservation state after blocking new reservations:
previously, we were checking the reservation count from before calling
__journal_block().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a UFS host controller interface (UFSHCI) node to mt8195.dtsi.
Introduce the 'mediatek,ufs-disable-mcq' property to allow disabling
Multiple Circular Queue (MCQ) support.
Signed-off-by: Rice Lee <ot_riceyj.lee@mediatek.com>
Signed-off-by: Eric Lin <ht.lin@mediatek.com>
Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com>
Link: https://lore.kernel.org/r/20250722085721.2062657-4-macpaul.lin@mediatek.com
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add MT8195 UFSHCI compatible string. Relax the schema to allow between
one to eight clocks/clock-names entries for all MediaTek UFS
nodes. Legacy platforms may only need a few clocks, whereas newer devices
such as the MT8195 require additional clock-gating domains. For MT8195
specifically, enforce exactly eight clocks and clock-names entries to
satisfy its hardware requirements.
Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com>
Link: https://lore.kernel.org/r/20250722085721.2062657-3-macpaul.lin@mediatek.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add the 'mediatek,ufs-disable-mcq' property to the UFS device-tree
bindings. This flag corresponds to the UFS_MTK_CAP_DISABLE_MCQ host
capability recently introduced in the UFS host driver, allowing it to
disable the Multiple Circular Queue (MCQ) feature when present. The
binding schema has also been updated to resolve DTBS check errors.
Cc: stable@vger.kernel.org
Fixes: 46bd3e31d74b ("scsi: ufs: mediatek: Add UFS_MTK_CAP_DISABLE_MCQ")
Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com>
Link: https://lore.kernel.org/r/20250722085721.2062657-2-macpaul.lin@mediatek.com
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add "mediatek,mt8195-ufshci" to the of_device_id table to enable support
for MediaTek MT8195/MT8395 UFS host controller. This matches the device
node entry in the MT8195/MT8395 device tree and allows proper driver
binding.
Signed-off-by: Macpaul Lin <macpaul.lin@mediatek.com>
Link: https://lore.kernel.org/r/20250722085721.2062657-1-macpaul.lin@mediatek.com
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Intel MTL-like host controllers"
Adrian Hunter <adrian.hunter@intel.com> says:
Hi
Here is V2 of a couple of fixes for Intel MTL-like UFS host controllers,
related to link Hibernation state.
Following the fixes are some improvements for the enabling and disabling
of UIC Completion interrupts.
Link: https://lore.kernel.org/r/20250723165856.145750-1-adrian.hunter@intel.com
Conflicts:
drivers/ufs/core/ufshcd.c
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Now that UFS core enables the UIC Completion interrupt only when needed,
Intel MTL driver no longer needs to control the interrupt itself. So
remove the associated code.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20250723165856.145750-9-adrian.hunter@intel.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Write a new value to the interrupt enable register only if it is
different from the old value, thereby saving a register write operation.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20250723165856.145750-8-adrian.hunter@intel.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Currently the UIC Completion interrupt is left enabled except for when
issuing link hibernate commands, in which case the interrupt is disabled
and then re-enabled.
Instead, set and clear the interrupt enable bit as needed.
That is slightly simpler and less error prone, but also avoids side
effects of accessing the interrupt enable register after entering link
hibernation. Specifically, for some host controllers like Intel MTL,
doing so disrupts the link state transition.
Note also, the interrupt register is not read back anymore after it is
updated. No other code does that, so it is assumed to be no longer
necessary if it ever was.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20250723165856.145750-7-adrian.hunter@intel.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Make ufshcd_send_bsg_uic_cmd() call ufshcd_send_uic_cmd() instead of
duplicating its code.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20250723165856.145750-6-adrian.hunter@intel.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Move ufshcd_enable_intr() and ufshcd_disable_intr() so they can be called
in subsequent patches without forward declarations.
No functional change.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20250723165856.145750-5-adrian.hunter@intel.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
->late_init() was introduced to allow the default values for rpm_lvl and
spm_lvl to be set. Since commit bb9850704c04 ("scsi: ufs: core: Honor
runtime/system PM levels if set by host controller drivers") and commit
fe06b7c07f3f ("scsi: ufs: core: Set default runtime/system PM levels
before ufshcd_hba_init()"), those default values can be set in the
->init() variant call back.
Move the setting of default values for rpm_lvl and spm_lvl to ->init()
and remove ->late_init().
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20250723165856.145750-4-adrian.hunter@intel.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Intel MTL-like host controllers support auto-hibernate. Using
auto-hibernate with manual (driver initiated) hibernate produces more
complex operation. For example, the host controller will have to exit
auto-hibernate simply to allow the driver to enter hibernate state
manually. That is not recommended.
The default rpm_lvl and spm_lvl is 3, which includes manual hibernate.
Change the default values to 2, which does not.
Note, to be simpler to backport to stable kernels, utilize the UFS PCI
driver's ->late_init() call back. Recent commits have made it possible
to set up a controller-specific default in the regular ->init() call
back, but not all stable kernels have those changes.
Fixes: 4049f7acef3e ("scsi: ufs: ufs-pci: Add support for Intel MTL")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20250723165856.145750-3-adrian.hunter@intel.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
controllers
UFSHCD core disables the UIC completion interrupt when issuing UIC
hibernation commands, and re-enables it afterwards if it was enabled to
start with, refer ufshcd_uic_pwr_ctrl(). For Intel MTL-like host
controllers, accessing the register to re-enable the interrupt disrupts
the state transition.
Use hibern8_notify variant operation to disable the interrupt during the
entire hibernation, thereby preventing the disruption.
Fixes: 4049f7acef3e ("scsi: ufs: ufs-pci: Add support for Intel MTL")
Cc: stable@vger.kernel.org
Signed-off-by: Archana Patni <archana.patni@intel.com>
Link: https://lore.kernel.org/r/20250723165856.145750-2-adrian.hunter@intel.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Return 0 instead of -EINVAL if function ccu_pll_recalc_rate() fails to
get correct rate entry. Follow .recalc_rate callback documentation
as mentioned in include/linux/clk-provider.h for error return value.
Signed-off-by: Akhilesh Patil <akhilesh@ee.iitb.ac.in>
Fixes: 1b72c59db0add ("clk: spacemit: Add clock support for SpacemiT K1 SoC")
Reviewed-by: Haylen Chu <heylenay@4d2.org>
Reviewed-by: Alex Elder <elder@riscstar.com>
Link: https://lore.kernel.org/r/aIBzVClNQOBrjIFG@bhairav-test.ee.iitb.ac.in
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
MediaTek platforms"
peter.wang@mediatek.com says:
This series fixes some defects and provide features in MediaTek UFS drivers.
Link: https://lore.kernel.org/r/20250722030841.1998783-1-peter.wang@mediatek.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add support for scaling the FDE (AES) clock to achieve higher
performance, particularly for HS-G5:
1. Parse DTS settings for FDE min/max mux.
2. Scale up the FDE clock when required for enhanced performance.
These changes ensure that the FDE clock can be dynamically adjusted based
on performance needs, leveraging DTS configurations.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-10-peter.wang@mediatek.com
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Add support for clock scaling with Vcore binding:
1. Parse the DTS setting for Vcore voltage.
2. Set the Vcore voltage to the DTS-specified value before scaling up.
3. Reset the Vcore voltage to the default setting after scaling down.
These changes ensure that the Vcore voltage is appropriately managed
during clock scaling operations to maintain system stability and
performance.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-9-peter.wang@mediatek.com
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Introduce a clock scaling readiness query function to streamline the
process of checking clock scaling parameters. This function simplifies
the code by encapsulating the logic for determining if clock scaling is
ready.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-8-peter.wang@mediatek.com
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Introduce a function for version control to distinguish between new and
old platforms. Update the handling of hardware IP versions, ensuring
correct version comparisons by adjusting the version format for specific
projects.
Signed-off-by: Alice Chao <alice.chao@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-7-peter.wang@mediatek.com
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Set the IRQ affinity for MCQ mode to improve performance. Specifically,
it migrates the IRQ from CPU0 to CPU3 to enhance IRQ handling efficiency.
Setting IRQ affinity directly from the kernel allows the configuration to
take effect earlier, and provides greater security and consistency,
especially important for systems with strict performanceor real-time
requirements.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-6-peter.wang@mediatek.com
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Introduce a mechanism to handle broken RTC by checking the DTS
setting. The configuration is specifically required for legacy platform.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-5-peter.wang@mediatek.com
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Update the timeout policy for ref-clk control.
- If a clock-on operation times out, it is assumed that the clock is
off. The system will notify TFA to perform clock-off settings.
- If a clock-off operation times out, it is assumed that the clock will
eventually turn off. The 'ref_clk_enabled' flag is set directly.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-4-peter.wang@mediatek.com
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
On MT6989 and later platforms, control of DDR_EN has been switched from
SPM to EMI. To prevent abnormal access to DRAM, it is necessary to wait
for 'ddren_ack' or assert 'ddren_urgent' after sending 'ddren_req'.
Introduce the DDR_EN configuration in the UFS initialization flow,
utilizing the assertion of 'ddren_urgent' to maintain performance.
Signed-off-by: Naomi Chu <naomi.chu@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-3-peter.wang@mediatek.com
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Simplify the conversion from unsigned int to boolean by removing explicit
conversions and parentheses, relying on implicit conversion instead. This
change ensures consistency with other usages in ufs-mediatek.c and
streamlines the code.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20250722030841.1998783-2-peter.wang@mediatek.com
Reviewed-by: Chun-Hung Wu <chun-hung.wu@mediatek.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"11 hotfixes. 9 are cc:stable and the remainder address post-6.15
issues or aren't considered necessary for -stable kernels.
7 are for MM"
* tag 'mm-hotfixes-stable-2025-07-24-18-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
sprintf.h requires stdarg.h
resource: fix false warning in __request_region()
mm/damon/core: commit damos_quota_goal->nid
kasan: use vmalloc_dump_obj() for vmalloc error reports
mm/ksm: fix -Wsometimes-uninitialized from clang-21 in advisor_mode_show()
mm: update MAINTAINERS entry for HMM
nilfs2: reject invalid file types when reading inodes
selftests/mm: fix split_huge_page_test for folio_split() tests
mailmap: add entry for Senozhatsky
mm/zsmalloc: do not pass __GFP_MOVABLE if CONFIG_COMPACTION=n
mm/vmscan: fix hwpoisoned large folio handling in shrink_folio_list
|
|
All callers have been converted to use filemap_grab_folio().
Link: https://lkml.kernel.org/r/20250721204619.163883-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
damon_migrate_pages() tries migration even if the target node is invalid.
If users mistakenly make such invalid requests via
DAMOS_MIGRATE_{HOT,COLD} action, the below kernel BUG can happen.
[ 7831.883495] BUG: unable to handle page fault for address: 0000000000001f48
[ 7831.884160] #PF: supervisor read access in kernel mode
[ 7831.884681] #PF: error_code(0x0000) - not-present page
[ 7831.885203] PGD 0 P4D 0
[ 7831.885468] Oops: Oops: 0000 [#1] SMP PTI
[ 7831.885852] CPU: 31 UID: 0 PID: 94202 Comm: kdamond.0 Not tainted 6.16.0-rc5-mm-new-damon+ #93 PREEMPT(voluntary)
[ 7831.886913] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-4.el9 04/01/2014
[ 7831.887777] RIP: 0010:__alloc_frozen_pages_noprof (include/linux/mmzone.h:1724 include/linux/mmzone.h:1750 mm/page_alloc.c:4936 mm/page_alloc.c:5137)
[...]
[ 7831.895953] Call Trace:
[ 7831.896195] <TASK>
[ 7831.896397] __folio_alloc_noprof (mm/page_alloc.c:5183 mm/page_alloc.c:5192)
[ 7831.896787] migrate_pages_batch (mm/migrate.c:1189 mm/migrate.c:1851)
[ 7831.897228] ? __pfx_alloc_migration_target (mm/migrate.c:2137)
[ 7831.897735] migrate_pages (mm/migrate.c:2078)
[ 7831.898141] ? __pfx_alloc_migration_target (mm/migrate.c:2137)
[ 7831.898664] damon_migrate_folio_list (mm/damon/ops-common.c:321 mm/damon/ops-common.c:354)
[ 7831.899140] damon_migrate_pages (mm/damon/ops-common.c:405)
[...]
Add a target node validity check in damon_migrate_pages(). The validity
check is stolen from that of do_pages_move(), which is being used for the
move_pages() system call.
Link: https://lkml.kernel.org/r/20250720185822.1451-1-sj@kernel.org
Fixes: b51820ebea65 ("mm/damon/paddr: introduce DAMOS_MIGRATE_COLD action for demotion") [6.11.x]
Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Honggyu Kim <honggyu.kim@sk.com>
Cc: Hyeongtak Ji <hyeongtak.ji@sk.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Rather confusingly, setting all Transparent Huge Page sysfs settings to
"never" does not in fact result in THP being globally disabled.
Rather, it results in khugepaged being disabled, but one can still obtain
THP pages using madvise(..., MADV_COLLAPSE).
This is something that has remained poorly documented for some time, and
it is likely the received wisdom of most users of THP that never does, in
fact, mean never.
It is therefore important to highlight, very clearly, that this is not the
case.
[lorenzo.stoakes@oracle.com: update transhuge page to mention MADV_COLLAPSE]
Link: https://lkml.kernel.org/r/d54d1dfb-f06d-4979-983b-73998f05867e@lucifer.local
Link: https://lkml.kernel.org/r/20250721155530.75944-1-lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Check that moving a range of VMAs where we are offset into the first and
last VMAs works correctly.
This results in the VMAs being split at these points at which we are offset
into VMAs.
We explicitly test both the ordinary MREMAP_FIXED multi VMA move case and
the MREMAP_DONTUNMAP multi VMA move case.
Link: https://lkml.kernel.org/r/b04920bb6c09dc86c207c251eab8ec670fbbcaef.1753119043.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
We support MREMAP_MAYMOVE | MREMAP_FIXED | MREMAP_DONTUNMAP for moving
multiple VMAs via mremap(), so assert that the tests pass with both
MREMAP_DONTUNMAP set and not set.
Additionally, add success = false settings when mremap() fails. This is
something that cannot realistically happen, so in no way impacted test
outcome, but it is incorrect to indicate a test pass when something has
failed.
Link: https://lkml.kernel.org/r/d7359941981e4e44c774753b3e364d1c54928e6a.1753119043.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "tools/testing: expand mremap testing".
Expand our mremap() testing to further assert that behaviour is as
expected.
There is a poorly documented mremap() feature whereby it is possible to
mremap() multiple VMAs (even with gaps) when shrinking, as long as the
resultant shrunk range spans only a single VMA.
So we start by asserting this behaviour functions correctly both with an
in-place shrink and a shrink/move.
Next, we further test the newly introduced ability to mremap() multiple
VMAs when performing a MAP_FIXED move (that is without the size being
changed), firstly by asserting that MREMAP_DONTUNMAP has no bearing on
this behaviour.
Finally, we explicitly test that such moves, when splitting source VMAs,
function correctly.
This patch (of 3):
There is an apparently little-known feature of mremap() whereby, in stark
contrast to other modes (other than the recently introduced capacity to
move multiple VMAs), the input source range span multiple VMAs with gaps
between.
This is, when shrinking a VMA, whether moving it or not, and the shrink
would reduce the range to a single VMA - this is permitted, as the shrink
is actioned by an unmap.
This patch adds tests to assert that this behaves as expected.
Link: https://lkml.kernel.org/r/cover.1753119043.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/f08122893a26092a2bec6e69443e87f468ffdbed.1753119043.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
To ensure only the current test is skipped on permission failure, instead
of terminating the entire test binary.
Link: https://lkml.kernel.org/r/20250717131857.59909-3-lianux.mm@gmail.com
Signed-off-by: wang lian <lianux.mm@gmail.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <ryncsn@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "selftests/mm: reuse FORCE_READ to replace "asm volatile("" :
"+r" (XXX));" and some cleanup", v2.
This series introduces a common FORCE_READ() macro to replace the cryptic
asm volatile("" : "+r" (variable)); construct used in several mm
selftests. This improves code readability and maintainability by removing
duplicated, hard-to-understand code.
This patch (of 2):
Several mm selftests use the `asm volatile("" : "+r" (variable));`
construct to force a read of a variable, preventing the compiler from
optimizing away the memory access. This idiom is cryptic and duplicated
across multiple test files.
Following a suggestion from David[1], this patch refactors this common
pattern into a FORCE_READ() macro
Link: https://lkml.kernel.org/r/20250717131857.59909-1-lianux.mm@gmail.com
Link: https://lkml.kernel.org/r/20250717131857.59909-2-lianux.mm@gmail.com
Link: https://lore.kernel.org/lkml/4a3e0759-caa1-4cfa-bc3f-402593f1eee3@redhat.com/ [1]
Signed-off-by: wang lian <lianux.mm@gmail.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kairui Song <ryncsn@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Override the generic definition of modify_prot_start_ptes() to use
get_and_clear_full_ptes(). This helper does a TLBI only for the starting
and ending contpte block of the range, whereas the current implementation
will call ptep_get_and_clear() for every contpte block, thus doing a TLBI
on every contpte block. Therefore, we have a performance win.
The arm64 definition of pte_accessible() allows us to batch in the
errata specific case:
#define pte_accessible(mm, pte) \
(mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid(pte))
All ptes are obviously present in the folio batch, and they are also valid.
Override the generic definition of modify_prot_commit_ptes() to simply use
set_ptes() to map the new ptes into the pagetable.
Link: https://lkml.kernel.org/r/20250718090244.21092-8-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Zhenhua Huang <quic_zhenhuah@quicinc.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Use folio_pte_batch to batch process a large folio. Note that, PTE
batching here will save a few function calls, and this strategy in certain
cases (not this one) batches atomic operations in general, so we have a
performance win for all arches. This patch paves the way for patch 7
which will help us elide the TLBI per contig block on arm64.
The correctness of this patch lies on the correctness of setting the new
ptes based upon information only from the first pte of the batch (which
may also have accumulated a/d bits via modify_prot_start_ptes()).
Observe that the flag combination we pass to mprotect_folio_pte_batch()
guarantees that the batch is uniform w.r.t the soft-dirty bit and the
writable bit. Therefore, the only bits which may differ are the a/d bits.
So we only need to worry about code which is concerned about the a/d bits
of the PTEs.
Setting extra a/d bits on the new ptes where previously they were not set,
is fine - setting access bit when it was not set is not an incorrectness
problem but will only possibly delay the reclaim of the page mapped by the
pte (which is in fact intended because the kernel just operated on this
region via mprotect()!). Setting dirty bit when it was not set is again
not an incorrectness problem but will only possibly force an unnecessary
writeback.
So now we need to reason whether something can go wrong via
can_change_pte_writable(). The pte_protnone, pte_needs_soft_dirty_wp, and
userfaultfd_pte_wp cases are solved due to uniformity in the corresponding
bits guaranteed by the flag combination. The ptes all belong to the same
VMA (since callers guarantee that [start, end) will lie within the VMA)
therefore the conditional based on the VMA is also safe to batch around.
Since the dirty bit on the PTE really is just an indication that the folio
got written to - even if the PTE is not actually dirty but one of the PTEs
in the batch is, the wp-fault optimization can be made. Therefore, it is
safe to batch around pte_dirty() in can_change_shared_pte_writable() (in
fact this is better since without batching, it may happen that some ptes
aren't changed to writable just because they are not dirty, even though
the other ptes mapping the same large folio are dirty).
To batch around the PageAnonExclusive case, we must check the
corresponding condition for every single page. Therefore, from the large
folio batch, we process sub batches of ptes mapping pages with the same
PageAnonExclusive condition, and process that sub batch, then determine
and process the next sub batch, and so on. Note that this does not cause
any extra overhead; if suppose the size of the folio batch is 512, then
the sub batch processing in total will take 512 iterations, which is the
same as what we would have done before.
For pte_needs_flush():
ppc does not care about the a/d bits.
For x86, PAGE_SAVED_DIRTY is ignored. We will flush only when a/d bits
get cleared; since we can only have extra a/d bits due to batching, we
will only have an extra flush, not a case where we elide a flush due to
batching when we shouldn't have.
Link: https://lkml.kernel.org/r/20250718090244.21092-7-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Zhenhua Huang <quic_zhenhuah@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
In preparation for patch 6 and modularizing the code in general, split
can_change_pte_writable() into private and shared VMA parts. No
functional change intended.
Link: https://lkml.kernel.org/r/20250718090244.21092-6-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Zhenhua Huang <quic_zhenhuah@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch 6 ("mm: Optimize mprotect() by PTE batching") optimizes mprotect()
by batch clearing the ptes, masking in the new protections, and batch
setting the ptes. Suppose that the first pte of the batch is writable -
with the current implementation of folio_pte_batch(), it is not guaranteed
that the other ptes in the batch are already writable too, so we may
incorrectly end up setting the writable bit on all ptes via
modify_prot_commit_ptes().
Therefore, introduce FPB_RESPECT_WRITE so that all ptes in the batch are
writable or not.
Link: https://lkml.kernel.org/r/20250718090244.21092-5-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Zhenhua Huang <quic_zhenhuah@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Batch ptep_modify_prot_start/commit in preparation for optimizing
mprotect, implementing them as a simple loop over the corresponding single
pte helpers. Architecture may override these helpers.
Link: https://lkml.kernel.org/r/20250718090244.21092-4-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joey Gouly <joey.gouly@arm.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <yang@os.amperecomputing.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Zhenhua Huang <quic_zhenhuah@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|