summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-05-19scsi: elx: efct: Remove NULL check after calling container_of()Haowen Bai
container_of() will never return NULL. Link: https://lore.kernel.org/r/1652750737-22673-1-git-send-email-baihaowen@meizu.com Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19scsi: dpt_i2o: Drop redundant spinlock initializationHaowen Bai
adpt_post_wait_lock was declared and initialized by DEFINE_SPINLOCK so we don't need to call spin_lock_init(). Drop the call. Link: https://lore.kernel.org/r/1652176024-3981-1-git-send-email-baihaowen@meizu.com Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19scsi: qedf: Remove redundant variable opColin Ian King
The variable 'op' is assigned a value and is never read. The variable is not used and is redundant, remove it. Link: https://lore.kernel.org/r/20220517092518.93159-1-colin.i.king@gmail.com Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19scsi: hisi_sas: Fix memory ordering in hisi_sas_task_deliver()John Garry
The memories for the slot should be observed to be written prior to observing the slot as ready. Prior to commit 26fc0ea74fcb ("scsi: libsas: Drop SAS_TASK_AT_INITIATOR"), we had a spin_lock() + spin_unlock() immediately before marking the slot as ready. The spin_unlock() - with release semantics - caused the slot memory to be observed to be written. Now that the spin_lock() + spin_unlock() is gone, use a smp_wmb(). Link: https://lore.kernel.org/r/1652774661-12935-1-git-send-email-john.garry@huawei.com Fixes: 26fc0ea74fcb ("scsi: libsas: Drop SAS_TASK_AT_INITIATOR") Reported-by: Yihang Li <liyihang6@hisilicon.com> Tested-by: Yihang Li <liyihang6@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19scsi: fnic: Replace DMA mask of 64 bits with 47 bitsKaran Tilak Kumar
Cisco VIC supports only 47 bits. If the host sends DMA addresses that are greater than 47 bits, it causes work queue (WQ) errors in the VIC. Link: https://lore.kernel.org/r/20220513205605.81788-1-kartilak@cisco.com Tested-by: Karan Tilak Kumar <kartilak@cisco.com> Co-developed-by: Dhanraj Jhawar <djhawar@cisco.com> Signed-off-by: Dhanraj Jhawar <djhawar@cisco.com> Co-developed-by: Sesidhar Baddela <sebaddel@cisco.com> Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com> Signed-off-by: Arulprabhu Ponnusamy <arulponn@cisco.com> Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19scsi: mpi3mr: Add target device related sysfs attributesSreekanth Reddy
Add sysfs attributes for exposing target device details such as SAS address, firmware device handle, and persistent ID for the controller-attached devices and RAID volumes. Link: https://lore.kernel.org/r/20220517115310.13062-3-sreekanth.reddy@broadcom.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19scsi: mpi3mr: Add shost related sysfs attributesSreekanth Reddy
Add shost related sysfs attributes to display the controller's firmware version, queue depth, number of requests, and number of reply queues. Also add an attribute to set & get the logging_level. Link: https://lore.kernel.org/r/20220517115310.13062-2-sreekanth.reddy@broadcom.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19scsi: elx: efct: Remove redundant memset() statementHarshit Mogalapalli
As memset() of bmbx is immediately followed by a memcpy() where bmbx is the destination, the memset() is redundant. Link: https://lore.kernel.org/r/20220505143703.45441-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19scsi: megaraid_sas: Remove redundant memset() statementHarshit Mogalapalli
As memset() of scmd->sense_buffer is immediately followed by a memcpy() where scmd->sense_buffer is the destination. The memset() is redundant. Link: https://lore.kernel.org/r/20220505143214.44908-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19scsi: mpi3mr: Return error if dma_alloc_coherent() failsDan Carpenter
Return -ENOMEM instead of success if dma_alloc_coherent() fails. Link: https://lore.kernel.org/r/YnOmMGHqCOtUCYQ1@kili Fixes: 43ca11005098 ("scsi: mpi3mr: Add support for PEL commands") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19scsi: hisi_sas: Fix rescan after deleting a diskJohn Garry
Removing an ATA device via sysfs means that the device may not be found through re-scanning: root@ubuntu:/home/john# lsscsi [0:0:0:0] disk SanDisk LT0200MO P404 /dev/sda [0:0:1:0] disk ATA HGST HUS724040AL A8B0 /dev/sdb [0:0:8:0] enclosu 12G SAS Expander RevB - root@ubuntu:/home/john# echo 1 > /sys/block/sdb/device/delete root@ubuntu:/home/john# echo "- - -" > /sys/class/scsi_host/host0/scan root@ubuntu:/home/john# lsscsi [0:0:0:0] disk SanDisk LT0200MO P404 /dev/sda [0:0:8:0] enclosu 12G SAS Expander RevB - root@ubuntu:/home/john# The problem is that the rescan of the device may conflict with the device in being re-initialized, as follows: - In the rescan we call hisi_sas_slave_alloc() in store_scan() -> sas_user_scan() -> [__]scsi_scan_target() -> scsi_probe_and_add_lunc() -> scsi_alloc_sdev() -> hisi_sas_slave_alloc() -> hisi_sas_init_device() In hisi_sas_init_device() we issue an IT nexus reset for ATA devices - That IT nexus causes the remote PHY to go down and this triggers a bcast event - In parallel libsas processes the bcast event, finds that the phy is down and marks the device as gone The hard reset issued in hisi_sas_init_device() is unncessary - as described in the code comment - so remove it. Also set dev status as HISI_SAS_DEV_NORMAL as the hisi_sas_init_device() call. Link: https://lore.kernel.org/r/1652354134-171343-4-git-send-email-john.garry@huawei.com Fixes: 36c6b7613ef1 ("scsi: hisi_sas: Initialise devices in .slave_alloc callback") Tested-by: Yihang Li <liyihang6@hisilicon.com> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19scsi: hisi_sas: Use sas_ata_wait_after_reset() in IT nexus resetJohn Garry
We have seen errors like this when a SATA device is probed: [524.566298] hisi_sas_v3_hw 0000L74:02.0: erroneous completion iptt=4096 ... [524.582827] sas: TMF task open reject failed 500e004aaaaaaaa00 Since commit 21c7e972475e ("scsi: hisi_sas: Disable SATA disk phy for severe I_T nexus reset failure"), we issue an ATA softreset to disks after a phy reset to ensure that they are in sound working order. If the softreset is issued before the remote phy has come back up then the softreset will fail (errors as above). Remedy this by waiting for the phy to come back up after the reset. Link: https://lore.kernel.org/r/1652354134-171343-3-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19scsi: libsas: Refactor sas_ata_hard_reset()John Garry
Create function sas_ata_wait_after_reset() from sas_ata_hard_reset() as some LLDDs may want to check for a remote ATA phy is up after reset. Link: https://lore.kernel.org/r/1652354134-171343-2-git-send-email-john.garry@huawei.com Tested-by: Yihang Li <liyihang6@hisilicon.com> Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19scsi: mpt3sas: Update driver version to 42.100.00.00Sreekanth Reddy
Update driver version to 42.100.00.00. Link: https://lore.kernel.org/r/20220511072621.30657-2-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19scsi: mpt3sas: Fix junk chars displayed while printing ChipNameSreekanth Reddy
Terminate string after copying 16 bytes of ChipName data from Manufacturing Page0 to prevent %s from printing junk characters. Link: https://lore.kernel.org/r/20220511072621.30657-1-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-05-19net: macb: Fix PTP one step sync supportHarini Katakam
PTP one step sync packets cannot have CSUM padding and insertion in SW since time stamp is inserted on the fly by HW. In addition, ptp4l version 3.0 and above report an error when skb timestamps are reported for packets that not processed for TX TS after transmission. Add a helper to identify PTP one step sync and fix the above two errors. Add a common mask for PTP header flag field "twoStepflag". Also reset ptp OSS bit when one step is not selected. Fixes: ab91f0a9b5f4 ("net: macb: Add hardware PTP support") Fixes: 653e92a9175e ("net: macb: add support for padding and fcs computation") Signed-off-by: Harini Katakam <harini.katakam@xilinx.com> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20220518170756.7752-1-harini.katakam@xilinx.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-19clk: mediatek: mt8173: Switch to clk_hw provider APIsChen-Yu Tsai
As part of the effort to improve the MediaTek clk drivers, the next step is to switch from the old 'struct clk' clk prodivder APIs to the new 'struct clk_hw' ones. The MT8173 clk driver has one clk that is registered directly with the clk provider APIs, instead of going through the MediaTek clk library. Switch this instance to use the clk_hw provider API. Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Miles Chen <miles.chen@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Miles Chen <miles.chen@mediatek.com> Link: https://lore.kernel.org/r/20220519071610.423372-6-wenst@chromium.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-05-19clk: mediatek: Switch to clk_hw provider APIsChen-Yu Tsai
As part of the effort to improve the MediaTek clk drivers, the next step is to switch from the old 'struct clk' clk prodivder APIs to the new 'struct clk_hw' ones. In a previous patch, 'struct clk_onecell_data' was replaced with 'struct clk_hw_onecell_data', with (struct clk_hw *)->clk and __clk_get_hw() bridging the new data structures and old code. Now switch from the old 'clk_(un)?register*()' APIs to the new 'clk_hw_(un)?register*()' ones. This is done with the coccinelle script below. Unfortunately this also leaves clk-mt8173.c with a compile error that would need a coccinelle script longer than the actual diff to fix. This last part is fixed up by hand. // Fix prototypes @@ identifier F =~ "^mtk_clk_register_"; @@ - struct clk * + struct clk_hw * F(...); // Fix calls to mtk_clk_register_<singular> @ reg @ identifier F =~ "^mtk_clk_register_"; identifier FS =~ "^mtk_clk_register_[a-z_]*s"; identifier I; expression clk_data; expression E; @@ FS(...) { ... - struct clk *I; + struct clk_hw *hw; ... for (...;...;...) { ... ( - I + hw = - clk_register_fixed_rate( + clk_hw_register_fixed_rate( ... ); | - I + hw = - clk_register_fixed_factor( + clk_hw_register_fixed_factor( ... ); | - I + hw = - clk_register_divider( + clk_hw_register_divider( ... ); | - I + hw = F(...); ) ... if ( - IS_ERR(I) + IS_ERR(hw) ) { pr_err(..., - I + hw ,...); ... } - clk_data->hws[E] = __clk_get_hw(I); + clk_data->hws[E] = hw; } ... } @ depends on reg @ identifier reg.I; @@ return PTR_ERR( - I + hw ); // Fix mtk_clk_register_composite to return clk_hw instead of clk @@ identifier I, R; expression E; @@ - struct clk * + struct clk_hw * mtk_clk_register_composite(...) { ... - struct clk *I; + struct clk_hw *hw; ... - I = clk_register_composite( + hw = clk_hw_register_composite( ...); if (IS_ERR( - I + hw )) { ... R = PTR_ERR( - I + hw ); ... } return - I + hw ; ... } // Fix other mtk_clk_register_<singular> to return clk_hw instead of clk @@ identifier F =~ "^mtk_clk_register_"; identifier I, D, C; expression E; @@ - struct clk * + struct clk_hw * F(...) { ... - struct clk *I; + int ret; ... - I = clk_register(D, E); + ret = clk_hw_register(D, E); ... ( - if (IS_ERR(I)) + if (ret) { kfree(C); + return ERR_PTR(ret); + } | - if (IS_ERR(I)) + if (ret) { kfree(C); - return I; + return ERR_PTR(ret); } ) - return I; + return E; } // Fix mtk_clk_unregister_<singular> to take clk_hw instead of clk @@ identifier F =~ "^mtk_clk_unregister_"; identifier I, I2; @@ static void F( - struct clk *I + struct clk_hw *I2 ) { ... - struct clk_hw *I2; ... - I2 = __clk_get_hw(I); ... ( - clk_unregister(I); + clk_hw_unregister(I2); | - clk_unregister_composite(I); + clk_hw_unregister_composite(I2); ) ... } // Fix calls to mtk_clk_unregister_*() @@ identifier F =~ "^mtk_clk_unregister_"; expression I; expression E; @@ - F(I->hws[E]->clk); + F(I->hws[E]); Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Miles Chen <miles.chen@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Miles Chen <miles.chen@mediatek.com> Link: https://lore.kernel.org/r/20220519071610.423372-5-wenst@chromium.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-05-19clk: mediatek: Replace 'struct clk' with 'struct clk_hw'Chen-Yu Tsai
As part of the effort to improve the MediaTek clk drivers, the next step is to switch from the old 'struct clk' clk prodivder APIs to the new 'struct clk_hw' ones. Instead of adding new APIs to the MediaTek clk driver library mirroring the existing ones, moving all drivers to the new APIs, and then removing the old ones, just migrate everything at the same time. This involves replacing 'struct clk' with 'struct clk_hw', and 'struct clk_onecell_data' with 'struct clk_hw_onecell_data', and fixing up all usages. For now, the clk_register() and co. usage is retained, with __clk_get_hw() and (struct clk_hw *)->clk used to bridge the difference between the APIs. These will be replaced in subsequent patches. Fix up mtk_{alloc,free}_clk_data to use 'struct clk_hw' by hand. Fix up all other affected call sites with the following coccinelle script. // Replace type @@ @@ - struct clk_onecell_data + struct clk_hw_onecell_data // Replace of_clk_add_provider() & of_clk_src_simple_get() @@ expression NP, DATA; symbol of_clk_src_onecell_get; @@ - of_clk_add_provider( + of_clk_add_hw_provider( NP, - of_clk_src_onecell_get, + of_clk_hw_onecell_get, DATA ) // Fix register/unregister @@ identifier CD; expression E; identifier fn =~ "unregister"; @@ fn(..., - CD->clks[E] + CD->hws[E]->clk ,... ); // Fix calls to clk_prepare_enable() @@ identifier CD; expression E; @@ clk_prepare_enable( - CD->clks[E] + CD->hws[E]->clk ); // Fix pointer assignment @@ identifier CD; identifier CLK; expression E; @@ - CD->clks[E] + CD->hws[E] = ( - CLK + __clk_get_hw(CLK) | ERR_PTR(...) ) ; // Fix pointer usage @@ identifier CD; expression E; @@ - CD->clks[E] + CD->hws[E] // Fix mtk_clk_pll_get_base() @@ symbol clk, hw, data; @@ mtk_clk_pll_get_base( - struct clk *clk, + struct clk_hw *hw, const struct mtk_pll_data *data ) { - struct clk_hw *hw = __clk_get_hw(clk); ... } // Fix mtk_clk_pll_get_base() usage @@ identifier CD; expression E; @@ mtk_clk_pll_get_base( - CD->clks[E] + CD->hws[E]->clk ,... ); Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Miles Chen <miles.chen@mediatek.com> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Miles Chen <miles.chen@mediatek.com> Link: https://lore.kernel.org/r/20220519071610.423372-4-wenst@chromium.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-05-19clk: mediatek: apmixed: Drop error message from clk_register() failureChen-Yu Tsai
mtk_clk_register_ref2usb_tx() prints an error message if clk_register() fails. It doesn't if kzalloc() fails though. The caller would then tack on its own error message to handle this. Also, All other clk registration functions in the MediaTek clk library leave the error message printing to the bulk registration functions, while the helpers that register individual clks just return error codes. Drop the error message that is printed when clk_register() fails in mtk_clk_register_ref2usb_tx() to make its behavior consistent both across its failure modes, and with the rest of the driver library. Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Miles Chen <miles.chen@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Miles Chen <miles.chen@mediatek.com> Link: https://lore.kernel.org/r/20220519071610.423372-3-wenst@chromium.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-05-19clk: mediatek: Make mtk_clk_register_composite() staticChen-Yu Tsai
mtk_clk_register_composite() is not used anywhere outside of the file it is defined. Make it static. Fixes: 9741b1a68035 ("clk: mediatek: Add initial common clock support for Mediatek SoCs.") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: Miles Chen <miles.chen@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Miles Chen <miles.chen@mediatek.com> Link: https://lore.kernel.org/r/20220519071610.423372-2-wenst@chromium.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-05-19Merge tag 'clk-v5.19-samsung' of ↵Stephen Boyd
https://git.kernel.org/pub/scm/linux/kernel/git/snawrocki/clk into clk-samsung Pull Samsung clk driver updates from Sylwester Nawrocki: - clock driver for exynosautov9 SoC * tag 'clk-v5.19-samsung' of https://git.kernel.org/pub/scm/linux/kernel/git/snawrocki/clk: clk: samsung: exynosautov9: add cmu_peric1 clock support clk: samsung: exynosautov9: add cmu_peric0 clock support clk: samsung: exynosautov9: add cmu_fsys2 clock support clk: samsung: exynosautov9: add cmu_busmc clock support clk: samsung: exynosautov9: add cmu_peris clock support clk: samsung: exynosautov9: add cmu_core clock support clk: samsung: add top clock support for Exynos Auto v9 SoC dt-bindings: clock: add Exynos Auto v9 SoC CMU bindings dt-bindings: clock: add clock binding definitions for Exynos Auto v9
2022-05-19Merge tag 'linux-can-next-for-5.19-20220519' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2022-05-19 Oliver Hartkopp contributes a patch for the ISO-TP CAN protocol to update the validation of address information during bind. The next patch is by Jakub Kicinski and converts the CAN network drivers from netif_napi_add() to the netif_napi_add_weight() function. Another patch by Oliver Hartkopp removes obsolete CAN specific LED support. Vincent Mailhol's patch for the mcp251xfd driver fixes a -Wunaligned-access warning by clang-14. * tag 'linux-can-next-for-5.19-20220519' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: can: mcp251xfd: silence clang's -Wunaligned-access warning can: can-dev: remove obsolete CAN LED support can: can-dev: move to netif_napi_add_weight() can: isotp: isotp_bind(): do not validate unused address information ==================== Link: https://lore.kernel.org/r/20220519202308.1435903-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-19riscv: dts: microchip: fix gpio1 reg property typoConor Paxton
Fix reg address typo in the gpio1 stanza. Signed-off-by: Conor Paxton <conor.paxton@microchip.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Fixes: 528a5b1f2556 ("riscv: dts: microchip: add new peripherals to icicle kit device tree") Link: https://lore.kernel.org/r/20220517104058.2004734-1-conor.paxton@microchip.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-05-19dt-bindings: clock: Replace common binding with link to schemaRob Herring
The contents of the clock binding have been moved to the clock binding schema in the dtschema repository. The desire is for common bindings to be hosted in the dtschema repository. Replace the contents with a link to the clock binding schema as there are still many references to clock-bindings.txt in the tree. This will prevent additions without a schema. Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220428154154.2284317-1-robh@kernel.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-05-19x86/PCI: Disable E820 reserved region clipping starting in 2023Hans de Goede
Some firmware includes unusable space (host bridge registers, hidden PCI device BARs, etc) in PCI host bridge _CRS. As far as we know, there's nothing in the ACPI, UEFI, or PCI Firmware spec that requires the OS to remove E820 reserved regions from _CRS, so this seems like a firmware defect. As a workaround, 4dc2287c1805 ("x86: avoid E820 regions when allocating address space") has clipped out the unusable space in the past. This is required for machines like the following: - Dell Precision T3500 (the original motivator for 4dc2287c1805); see https://bugzilla.kernel.org/show_bug.cgi?id=16228 - Asus C523NA (Coral) Chromebook; see https://lore.kernel.org/all/4e9fca2f-0af1-3684-6c97-4c35befd5019@redhat.com/ - Lenovo ThinkPad X1 Gen 2; see: https://bugzilla.redhat.com/show_bug.cgi?id=2029207 But other firmware supplies E820 reserved regions that cover entire _CRS windows, and clipping throws away the entire window, leaving none for hot-added or uninitialized devices. This clipping breaks a whole range of Lenovo IdeaPads, Yogas, Yoga Slims, and notebooks, as well as Acer Spin 5 and Clevo X170KM-G Barebone machines. E820 reserved entries that cover a memory-mapped PCI host bridge, including its registers and memory/IO windows, are probably *not* a firmware defect. Per ACPI v5.4, sec 15.2, the E820 memory map may include: Address ranges defined for baseboard memory-mapped I/O devices, such as APICs, are returned as reserved. Disable the E820 clipping by default for all post-2022 machines. We already have quirks to disable clipping for pre-2023 machines, and we'll likely need quirks to *enable* clipping for post-2022 machines that incorrectly include unusable space in _CRS, including Chromebooks and Lenovo ThinkPads. Here's the rationale for doing this. If we do nothing, and continue clipping by default: - Future systems like the Lenovo IdeaPads, Yogas, etc, Acer Spin, and Clevo Barebones will require new quirks to disable clipping. - The problem here is E820 entries that cover entire _CRS windows that should not be clipped out. - I think these E820 entries are legal per spec, and it would be hard to get BIOS vendors to change them. - We will discover new systems that need clipping disabled piecemeal as they are released. - Future systems like Lenovo X1 Carbon and the Chromebooks (probably anything using coreboot) will just work, even though their _CRS is incorrect, so we will not notice new ones that rely on the clipping. - BIOS updates will not require new quirks unless they change the DMI model string. If we add the date check in this commit that disables clipping, e.g., "no clipping when date >= 2023": - Future systems like Lenovo *IIL*, Acer Spin, and Clevo Barebones will just work without new quirks. - Future systems like Lenovo X1 Carbon and the Chromebooks will require new quirks to *enable* clipping. - The problem here is that _CRS contains regions that are not usable by PCI devices, and we rely on the E820 kludge to clip them out. - I think this use of E820 is clearly a firmware bug, so we have a fighting chance of getting it changed eventually. - BIOS updates after the cutoff date *will* require quirks, but only for systems like Lenovo X1 Carbon and Chromebooks that we already think have broken firmware. It seems to me like it's better to add quirks for firmware that we think is broken than for firmware that seems unusual but correct. [bhelgaas: comment and commit log] Link: https://lore.kernel.org/linux-pci/20220518220754.GA7911@bhelgaas/ Link: https://lore.kernel.org/r/20220519152150.6135-4-hdegoede@redhat.com Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Benoit Grégoire <benoitg@coeus.ca> Cc: Hui Wang <hui.wang@canonical.com>
2022-05-19x86/PCI: Disable E820 reserved region clipping via quirksHans de Goede
To avoid unusable space that some firmware includes in PCI host bridge _CRS, Linux currently excludes E820 reserved regions from _CRS windows; see 4dc2287c1805 ("x86: avoid E820 regions when allocating address space"). However, some systems supply E820 reserved regions that cover the entire memory window from _CRS, so clipping them out leaves no space for hot-added or uninitialized PCI devices. For example, from a Lenovo IdeaPad 3 15IIL 81WE: BIOS-e820: [mem 0x4bc50000-0xcfffffff] reserved pci_bus 0000:00: root bus resource [mem 0x65400000-0xbfffffff window] pci 0000:00:15.0: BAR 0: [mem 0x00000000-0x00000fff 64bit] pci 0000:00:15.0: BAR 0: no space for [mem size 0x00001000 64bit] Add quirks to disable the E820 clipping for machines known to do this. A single DMI_PRODUCT_VERSION "IIL" quirk matches all the below: Lenovo IdeaPad 3 14IIL05 Lenovo IdeaPad 3 15IIL05 Lenovo IdeaPad 3 17IIL05 Lenovo IdeaPad 5 14IIL05 Lenovo IdeaPad 5 15IIL05 Lenovo IdeaPad Slim 7 14IIL05 Lenovo IdeaPad Slim 7 15IIL05 Lenovo IdeaPad S145-15IIL Lenovo IdeaPad S340-14IIL Lenovo IdeaPad S340-15IIL Lenovo IdeaPad C340-15IIL Lenovo BS145-15IIL Lenovo V14-IIL Lenovo V15-IIL Lenovo V17-IIL Lenovo Yoga C940-14IIL Lenovo Yoga S740-14IIL Lenovo Yoga Slim 7 14IIL05 Lenovo Yoga Slim 7 15IIL05 in addition to the following that don't actually need it because they have no E820 reserved regions that overlap _CRS windows: Lenovo IdeaPad Flex 5 14IIL05 Lenovo IdeaPad Flex 5 15IIL05 Lenovo ThinkBook 14-IIL Lenovo ThinkBook 15-IIL Lenovo Yoga S940-14IIL Other quirks match these: Acer Spin 5 (SP513-54N) Clevo X170KM-G Barebone Link: https://bugzilla.kernel.org/show_bug.cgi?id=206459 Lenovo Yoga C940-14IIL Link: https://bugzilla.kernel.org/show_bug.cgi?id=214259 Clevo X170KM Barebone Link: https://bugzilla.redhat.com/show_bug.cgi?id=1868899 Lenovo IdeaPad 3 15IIL05 Link: https://bugzilla.redhat.com/show_bug.cgi?id=1871793 Lenovo IdeaPad 5 14IIL05 Link: https://bugs.launchpad.net/bugs/1878279 Lenovo IdeaPad 5 14IIL05 Link: https://bugs.launchpad.net/bugs/1880172 Lenovo IdeaPad 3 14IIL05 Link: https://bugs.launchpad.net/bugs/1884232 Acer Spin SP513-54N Link: https://bugs.launchpad.net/bugs/1921649 Lenovo IdeaPad S145 Link: https://bugs.launchpad.net/bugs/1931715 Lenovo IdeaPad S145 Link: https://bugs.launchpad.net/bugs/1932069 Lenovo BS145-15IIL Link: https://lore.kernel.org/r/20220519152150.6135-3-hdegoede@redhat.com Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Benoit Grégoire <benoitg@coeus.ca> Cc: Hui Wang <hui.wang@canonical.com>
2022-05-19perf/x86/amd/core: Fix reloading events for SVMSandipan Das
Commit 1018faa6cf23 ("perf/x86/kvm: Fix Host-Only/Guest-Only counting with SVM disabled") addresses an issue in which the Host-Only bit in the counter control registers needs to be masked off when SVM is not enabled. The events need to be reloaded whenever SVM is enabled or disabled for a CPU and this requires the PERF_CTL registers to be reprogrammed using {enable,disable}_all(). However, PerfMonV2 variants of these functions do not reprogram the PERF_CTL registers. Hence, the legacy enable_all() function should also be called. Fixes: 9622e67e3980 ("perf/x86/amd/core: Add PerfMonV2 counter control") Reported-by: Like Xu <likexu@tencent.com> Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220518084327.464005-1-sandipan.das@amd.com
2022-05-19topology: Remove unused cpu_cluster_mask()Dietmar Eggemann
default_topology[] uses cpu_clustergroup_mask() for the CLS level (guarded by CONFIG_SCHED_CLUSTER) which is currently provided by x86 (arch/x86/kernel/smpboot.c) and arm64 (drivers/base/arch_topology.c). Fixes: 778c558f49a2c ("sched: Add cluster scheduler level in core and related Kconfig for ARM64") Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Barry Song <baohua@kernel.org> Link: https://lore.kernel.org/r/20220513093433.425163-1-dietmar.eggemann@arm.com
2022-05-19sched: Reverse sched_class layoutPeter Zijlstra
Because GCC-12 is fully stupid about array bounds and it's just really hard to get a solid array definition from a linker script, flip the array order to avoid needing negative offsets :-/ This makes the whole relational pointer magic a little less obvious, but alas. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/YoOLLmLG7HRTXeEm@hirez.programming.kicks-ass.net
2022-05-19bug: Use normal relative pointers in 'struct bug_entry'Josh Poimboeuf
With CONFIG_GENERIC_BUG_RELATIVE_POINTERS, the addr/file relative pointers are calculated weirdly: based on the beginning of the bug_entry struct address, rather than their respective pointer addresses. Make the relative pointers less surprising to both humans and tools by calculating them the normal way. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Sven Schnelle <svens@linux.ibm.com> # s390 Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64] Link: https://lkml.kernel.org/r/f0e05be797a16f4fc2401eeb88c8450dcbe61df6.1652362951.git.jpoimboe@kernel.org
2022-05-19sched/clock: Use try_cmpxchg64 in sched_clock_{local,remote}Uros Bizjak
Use try_cmpxchg64 instead of cmpxchg64 (*ptr, old, new) != old in sched_clock_{local,remote}. x86 cmpxchg returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220518184953.3446778-1-ubizjak@gmail.com
2022-05-19x86/entry: Fix register corruption in compat syscallJosh Poimboeuf
A panic was reported in the init process on AMD: Run /sbin/init as init process init[1]: segfault at f7fd5ca0 ip 00000000f7f5bbc7 sp 00000000ffa06aa0 error 7 in libc.so[f7f51000+4e000] Code: 8a 44 24 10 88 41 ff 8b 44 24 10 83 c4 2c 5b 5e 5f 5d c3 53 83 ec 08 8b 5c 24 10 81 fb 00 f0 ff ff 76 0c e8 ba dc ff ff f7 db <89> 18 83 cb ff 83 c4 08 89 d8 5b c3 e8 81 60 ff ff 05 28 84 07 00 Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b CPU: 1 PID: 1 Comm: init Tainted: G W 5.18.0-rc7-next-20220519 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x57/0x7d panic+0x10f/0x28d do_exit.cold+0x18/0x48 do_group_exit+0x2e/0xb0 get_signal+0xb6d/0xb80 arch_do_signal_or_restart+0x31/0x760 ? show_opcodes.cold+0x1c/0x21 ? force_sig_fault+0x49/0x70 exit_to_user_mode_prepare+0x131/0x1a0 irqentry_exit_to_user_mode+0x5/0x30 asm_exc_page_fault+0x27/0x30 RIP: 0023:0xf7f5bbc7 Code: 8a 44 24 10 88 41 ff 8b 44 24 10 83 c4 2c 5b 5e 5f 5d c3 53 83 ec 08 8b 5c 24 10 81 fb 00 f0 ff ff 76 0c e8 ba dc ff ff f7 db <89> 18 83 cb ff 83 c4 08 89 d8 5b c3 e8 81 60 ff ff 05 28 84 07 00 RSP: 002b:00000000ffa06aa0 EFLAGS: 00000217 RAX: 00000000f7fd5ca0 RBX: 000000000000000c RCX: 0000000000001000 RDX: 0000000000000001 RSI: 00000000f7fd5b60 RDI: 00000000f7fd5b60 RBP: 00000000f7fd1c1c R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> The task's CX register got corrupted by commit 8c42819b61b8 ("x86/entry: Use PUSH_AND_CLEAR_REGS for compat"), which overlooked the fact that compat SYSCALL apparently stores the user's CX value in BP. Before that commit, CX was saved from its stashed value in BP: pushq %rbp /* pt_regs->cx (stashed in bp) */ But then it got changed to: pushq %rcx /* pt_regs->cx */ So the wrong value got saved and later restored back to the user. Fix it by pushing the correct value again (BP) for regs->cx. Fixes: 8c42819b61b8 ("x86/entry: Use PUSH_AND_CLEAR_REGS for compat") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://lkml.kernel.org/r/b5a26592c9dd60bbacdf97974a7433fd802a5593.1652985970.git.jpoimboe@kernel.org
2022-05-19clk: qcom: rcg2: Cache CFG register updates for parked RCGsBjorn Andersson
As GDSCs are turned on and off some associated clocks are momentarily enabled for house keeping purposes. For this, and similar, purposes the "shared RCGs" will park the RCG on a source clock which is known to be available. When the RCG is parked, a safe clock source will be selected and committed, then the original source would be written back and upon enable the change back to the unparked source would be committed. But starting with SM8350 this fails, as the value in CFG is committed by the GDSC handshake and without a ticking parent the GDSC enablement will time out. This becomes a concrete problem if the runtime supended state of a device includes disabling such rcg's parent clock. As the device attempts to power up the domain again the rcg will fail to enable and hence the GDSC enablement will fail, preventing the device from returning from the suspended state. This can be seen in e.g. the display stack during probe on SM8350. To avoid this problem, the software needs to ensure that the RCG is configured to a active parent clock while it is disabled. This is done by caching the CFG register content while the shared RCG is parked on this safe source. Writes to M, N and D registers are committed as they are requested. New helpers for get_parent() and recalc_rate() are extracted from their previous implementations and __clk_rcg2_configure() is modified to allow it to operate on the cached value. Fixes: 7ef6f11887bd ("clk: qcom: Configure the RCGs to a safe source as needed") Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20220426212136.1543984-1-bjorn.andersson@linaro.org
2022-05-19clk: qcom: add sc8280xp GCC driverBjorn Andersson
Add support for the Global Clock Controller found in the Qualcomm SC8280XP platform. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Vinod Koul <vkoul@kernel.org> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20220505025457.1693716-3-bjorn.andersson@linaro.org
2022-05-19dt-bindings: clock: Add Qualcomm SC8280XP GCC bindingsBjorn Andersson
Add binding for the Qualcomm SC8280XP Global Clock controller. The clock-names property is purposefully omitted, to clearly communicate to the writer (and reader) of the DeviceTree source based on this binding that the order of "clocks" is significant, in contrast to previous GCC bindings. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20220505025457.1693716-2-bjorn.andersson@linaro.org
2022-05-19fs/ntfs: remove redundant variable idxColin Ian King
The variable idx is assigned a value and is never read. The variable is not used and is redundant, remove it. Cleans up clang scan build warning: warning: Although the value stored to 'idx' is used in the enclosing expression, the value is never actually read from 'idx' [deadcode.DeadStores] Link: https://lkml.kernel.org/r/20220517093646.93628-2-colin.i.king@gmail.com Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Anton Altaparmakov <anton@tuxera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19fat: remove time truncations in vfat_create/vfat_mkdirChung-Chiang Cheng
All the timestamps in vfat_create() and vfat_mkdir() come from fat_time_fat2unix() which ensures time granularity. We don't need to truncate them to fit FAT's format. Moreover, fat_truncate_crtime() and fat_timespec64_trunc_10ms() are also removed because there is no caller anymore. Link: https://lkml.kernel.org/r/20220503152536.2503003-4-cccheng@synology.com Signed-off-by: Chung-Chiang Cheng <cccheng@synology.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19fat: report creation time in statxChung-Chiang Cheng
creation time is no longer mixed with change time. Add an in-memory field for it, and report it in statx if supported. Link: https://lkml.kernel.org/r/20220503152536.2503003-3-cccheng@synology.com Signed-off-by: Chung-Chiang Cheng <cccheng@synology.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19fat: ignore ctime updates, and keep ctime identical to mtime in memoryChung-Chiang Cheng
FAT supports creation time but not change time, and there was no corresponding timestamp for creation time in previous VFS. The original implementation took the compromise of saving the in-memory change time into the on-disk creation time field, but this would lead to compatibility issues with non-linux systems. To address this issue, this patch changes the behavior of ctime. It will no longer be loaded and stored from the creation time on disk. Instead of that, it'll be consistent with the in-memory mtime and share the same on-disk field. All updates to mtime will also be applied to ctime in memory, while all updates to ctime will be ignored. Link: https://lkml.kernel.org/r/20220503152536.2503003-2-cccheng@synology.com Signed-off-by: Chung-Chiang Cheng <cccheng@synology.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19fat: split fat_truncate_time() into separate functionsChung-Chiang Cheng
Separate fat_truncate_time() to each timestamps for later creation time work. This patch does not introduce any functional changes, it's merely refactoring change. Link: https://lkml.kernel.org/r/20220503152536.2503003-1-cccheng@synology.com Signed-off-by: Chung-Chiang Cheng <cccheng@synology.com> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19MAINTAINERS: add Muchun as a memcg reviewerMuchun Song
I have been focusing on mm for the past two years. e.g. developing, fixing bugs, reviewing. I have fixed lots of races (including memcg). I would like to help people working on memcg or related by reviewing their work. Let me be Cc'd on patches related to memcg. Link: https://lkml.kernel.org/r/20220517143320.99649-1-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Shakeel Butt <shakeelb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: FanJun Kong <bh1scw@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19mm: damon: use HPAGE_PMD_SIZEKefeng Wang
Use HPAGE_PMD_SIZE instead of open coding. Link: https://lkml.kernel.org/r/20220517145120.118523-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolateVasily Averin
Fixes following sparse warnings: CHECK mm/vmscan.c mm/vmscan.c: note: in included file (through include/trace/trace_events.h, include/trace/define_trace.h, include/trace/events/vmscan.h): ./include/trace/events/vmscan.h:281:1: sparse: warning: cast to restricted isolate_mode_t ./include/trace/events/vmscan.h:281:1: sparse: warning: restricted isolate_mode_t degrades to integer Link: https://lkml.kernel.org/r/e85d7ff2-fd10-53f8-c24e-ba0458439c1b@openvz.org Signed-off-by: Vasily Averin <vvs@openvz.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19nodemask.h: fix compilation error with GCC12Christophe de Dinechin
With gcc version 12.0.1 20220401 (Red Hat 12.0.1-0), building with defconfig results in the following compilation error: | CC mm/swapfile.o | mm/swapfile.c: In function `setup_swap_info': | mm/swapfile.c:2291:47: error: array subscript -1 is below array bounds | of `struct plist_node[]' [-Werror=array-bounds] | 2291 | p->avail_lists[i].prio = 1; | | ~~~~~~~~~~~~~~^~~ | In file included from mm/swapfile.c:16: | ./include/linux/swap.h:292:27: note: while referencing `avail_lists' | 292 | struct plist_node avail_lists[]; /* | | ^~~~~~~~~~~ This is due to the compiler detecting that the mask in node_states[__state] could theoretically be zero, which would lead to first_node() returning -1 through find_first_bit. I believe that the warning/error is legitimate. I first tried adding a test to check that the node mask is not emtpy, since a similar test exists in the case where MAX_NUMNODES == 1. However, adding the if statement causes other warnings to appear in for_each_cpu_node_but, because it introduces a dangling else ambiguity. And unfortunately, GCC is not smart enough to detect that the added test makes the case where (node) == -1 impossible, so it still complains with the same message. This is why I settled on replacing that with a harmless, but relatively useless (node) >= 0 test. Based on the warning for the dangling else, I also decided to fix the case where MAX_NUMNODES == 1 by moving the condition inside the for loop. It will still only be tested once. This ensures that the meaning of an else following for_each_node_mask or derivatives would not silently have a different meaning depending on the configuration. Link: https://lkml.kernel.org/r/20220414150855.2407137-3-dinechin@redhat.com Signed-off-by: Christophe de Dinechin <christophe@dinechin.org> Signed-off-by: Christophe de Dinechin <dinechin@redhat.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Ben Segall <bsegall@google.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19mm: fix missing handler for __GFP_NOWARNQi Zheng
We expect no warnings to be issued when we specify __GFP_NOWARN, but currently in paths like alloc_pages() and kmalloc(), there are still some warnings printed, fix it. But for some warnings that report usage problems, we don't deal with them. If such warnings are printed, then we should fix the usage problems. Such as the following case: WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1)); [zhengqi.arch@bytedance.com: v2] Link: https://lkml.kernel.org/r/20220511061951.1114-1-zhengqi.arch@bytedance.com Link: https://lkml.kernel.org/r/20220510113809.80626-1-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Slaby <jirislaby@kernel.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19mm/page_alloc: fix tracepoint mm_page_alloc_zone_locked()Wonhyuk Yang
Currently, trace point mm_page_alloc_zone_locked() doesn't show correct information. First, when alloc_flag has ALLOC_HARDER/ALLOC_CMA, page can be allocated from MIGRATE_HIGHATOMIC/MIGRATE_CMA. Nevertheless, tracepoint use requested migration type not MIGRATE_HIGHATOMIC and MIGRATE_CMA. Second, after commit 44042b4498728 ("mm/page_alloc: allow high-order pages to be stored on the per-cpu lists") percpu-list can store high order pages. But trace point determine whether it is a refiil of percpu-list by comparing requested order and 0. To handle these problems, make mm_page_alloc_zone_locked() only be called by __rmqueue_smallest with correct migration type. With a new argument called percpu_refill, it can show roughly whether it is a refill of percpu-list. Link: https://lkml.kernel.org/r/20220512025307.57924-1-vvghjk1234@gmail.com Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Baik Song An <bsahn@etri.re.kr> Cc: Hong Yeon Kim <kimhy@etri.re.kr> Cc: Taeung Song <taeung@reallinux.co.kr> Cc: <linuxgeek@linuxgeek.io> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19mm/page_owner.c: add missing __initdata attributeFanjun Kong
This patch fixes two issues: 1. Add __initdata attribute according to include/linux/init.h: For initialized data: You should insert __initdata between the variable name and equal sign followed by value 2. Fix below error reported by checkpatch.pl: ERROR: do not initialise statics to false Special thanks to Muchun Song :) Link: https://lkml.kernel.org/r/20220516030039.1487005-1-bh1scw@gmail.com Signed-off-by: Fanjun Kong <bh1scw@gmail.com> Suggested-by: Muchun Song <songmuchun@bytedance.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19tmpfs: fix undefined-behaviour in shmem_reconfigure()Luo Meng
When shmem_reconfigure() calls __percpu_counter_compare(), the second parameter is unsigned long long. But in the definition of __percpu_counter_compare(), the second parameter is s64. So when __percpu_counter_compare() executes abs(count - rhs), UBSAN shows the following warning: ================================================================================ UBSAN: Undefined behaviour in lib/percpu_counter.c:209:6 signed integer overflow: 0 - -9223372036854775808 cannot be represented in type 'long long int' CPU: 1 PID: 9636 Comm: syz-executor.2 Tainted: G ---------r- - 4.18.0 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack home/install/linux-rh-3-10/lib/dump_stack.c:77 [inline] dump_stack+0x125/0x1ae home/install/linux-rh-3-10/lib/dump_stack.c:117 ubsan_epilogue+0xe/0x81 home/install/linux-rh-3-10/lib/ubsan.c:159 handle_overflow+0x19d/0x1ec home/install/linux-rh-3-10/lib/ubsan.c:190 __percpu_counter_compare+0x124/0x140 home/install/linux-rh-3-10/lib/percpu_counter.c:209 percpu_counter_compare home/install/linux-rh-3-10/./include/linux/percpu_counter.h:50 [inline] shmem_remount_fs+0x1ce/0x6b0 home/install/linux-rh-3-10/mm/shmem.c:3530 do_remount_sb+0x11b/0x530 home/install/linux-rh-3-10/fs/super.c:888 do_remount home/install/linux-rh-3-10/fs/namespace.c:2344 [inline] do_mount+0xf8d/0x26b0 home/install/linux-rh-3-10/fs/namespace.c:2844 ksys_mount+0xad/0x120 home/install/linux-rh-3-10/fs/namespace.c:3075 __do_sys_mount home/install/linux-rh-3-10/fs/namespace.c:3089 [inline] __se_sys_mount home/install/linux-rh-3-10/fs/namespace.c:3086 [inline] __x64_sys_mount+0xbf/0x160 home/install/linux-rh-3-10/fs/namespace.c:3086 do_syscall_64+0xca/0x5c0 home/install/linux-rh-3-10/arch/x86/entry/common.c:298 entry_SYSCALL_64_after_hwframe+0x6a/0xdf RIP: 0033:0x46b5e9 Code: 5d db fa ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b db fa ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f54d5f22c68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 000000000077bf60 RCX: 000000000046b5e9 RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000000 RBP: 000000000077bf60 R08: 0000000020000140 R09: 0000000000000000 R10: 00000000026740a4 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffd1fb1592f R14: 00007f54d5f239c0 R15: 000000000077bf6c ================================================================================ [akpm@linux-foundation.org: tweak error message text] Link: https://lkml.kernel.org/r/20220513025225.2678727-1-luomeng12@huawei.com Signed-off-by: Luo Meng <luomeng12@huawei.com> Cc: Hugh Dickins <hughd@google.com> Cc: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-19mm/mempolicy: fix uninit-value in mpol_rebind_policy()Wang Cheng
mpol_set_nodemask()(mm/mempolicy.c) does not set up nodemask when pol->mode is MPOL_LOCAL. Check pol->mode before access pol->w.cpuset_mems_allowed in mpol_rebind_policy()(mm/mempolicy.c). BUG: KMSAN: uninit-value in mpol_rebind_policy mm/mempolicy.c:352 [inline] BUG: KMSAN: uninit-value in mpol_rebind_task+0x2ac/0x2c0 mm/mempolicy.c:368 mpol_rebind_policy mm/mempolicy.c:352 [inline] mpol_rebind_task+0x2ac/0x2c0 mm/mempolicy.c:368 cpuset_change_task_nodemask kernel/cgroup/cpuset.c:1711 [inline] cpuset_attach+0x787/0x15e0 kernel/cgroup/cpuset.c:2278 cgroup_migrate_execute+0x1023/0x1d20 kernel/cgroup/cgroup.c:2515 cgroup_migrate kernel/cgroup/cgroup.c:2771 [inline] cgroup_attach_task+0x540/0x8b0 kernel/cgroup/cgroup.c:2804 __cgroup1_procs_write+0x5cc/0x7a0 kernel/cgroup/cgroup-v1.c:520 cgroup1_tasks_write+0x94/0xb0 kernel/cgroup/cgroup-v1.c:539 cgroup_file_write+0x4c2/0x9e0 kernel/cgroup/cgroup.c:3852 kernfs_fop_write_iter+0x66a/0x9f0 fs/kernfs/file.c:296 call_write_iter include/linux/fs.h:2162 [inline] new_sync_write fs/read_write.c:503 [inline] vfs_write+0x1318/0x2030 fs/read_write.c:590 ksys_write+0x28b/0x510 fs/read_write.c:643 __do_sys_write fs/read_write.c:655 [inline] __se_sys_write fs/read_write.c:652 [inline] __x64_sys_write+0xdb/0x120 fs/read_write.c:652 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x44/0xae Uninit was created at: slab_post_alloc_hook mm/slab.h:524 [inline] slab_alloc_node mm/slub.c:3251 [inline] slab_alloc mm/slub.c:3259 [inline] kmem_cache_alloc+0x902/0x11c0 mm/slub.c:3264 mpol_new mm/mempolicy.c:293 [inline] do_set_mempolicy+0x421/0xb70 mm/mempolicy.c:853 kernel_set_mempolicy mm/mempolicy.c:1504 [inline] __do_sys_set_mempolicy mm/mempolicy.c:1510 [inline] __se_sys_set_mempolicy+0x44c/0xb60 mm/mempolicy.c:1507 __x64_sys_set_mempolicy+0xd8/0x110 mm/mempolicy.c:1507 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x44/0xae KMSAN: uninit-value in mpol_rebind_task (2) https://syzkaller.appspot.com/bug?id=d6eb90f952c2a5de9ea718a1b873c55cb13b59dc This patch seems to fix below bug too. KMSAN: uninit-value in mpol_rebind_mm (2) https://syzkaller.appspot.com/bug?id=f2fecd0d7013f54ec4162f60743a2b28df40926b The uninit-value is pol->w.cpuset_mems_allowed in mpol_rebind_policy(). When syzkaller reproducer runs to the beginning of mpol_new(), mpol_new() mm/mempolicy.c do_mbind() mm/mempolicy.c kernel_mbind() mm/mempolicy.c `mode` is 1(MPOL_PREFERRED), nodes_empty(*nodes) is `true` and `flags` is 0. Then mode = MPOL_LOCAL; ... policy->mode = mode; policy->flags = flags; will be executed. So in mpol_set_nodemask(), mpol_set_nodemask() mm/mempolicy.c do_mbind() kernel_mbind() pol->mode is 4 (MPOL_LOCAL), that `nodemask` in `pol` is not initialized, which will be accessed in mpol_rebind_policy(). Link: https://lkml.kernel.org/r/20220512123428.fq3wofedp6oiotd4@ppc.localdomain Signed-off-by: Wang Cheng <wanngchenng@gmail.com> Reported-by: <syzbot+217f792c92599518a2ab@syzkaller.appspotmail.com> Tested-by: <syzbot+217f792c92599518a2ab@syzkaller.appspotmail.com> Cc: David Rientjes <rientjes@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>