summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-03-13s390/scm: use new address translation helpersHeiko Carstens
Use virt_to_dma64() and friends to properly convert virtual to physical and physical to virtual addresses so that "make C=1" does not generate any warnings anymore. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/cio: use new address translation helpersHeiko Carstens
Use virt_to_dma32() and friends to properly convert virtual to physical and physical to virtual addresses so that "make C=1" does not generate any warnings anymore. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/cio,idal: fix virtual vs physical address confusionHeiko Carstens
Fix virtual vs physical address confusion. This does not fix a bug since virtual and physical address spaces are currently the same. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/cio,idal: remove superfluous virt_to_phys() conversionHeiko Carstens
Only the last 12 bits of virtual / physical addresses are used when masking with IDA_BLOCK_SIZE - 1. Given that the bits are the same regardless of virtual or physical address, remove the virtual to physical address conversion. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/cio,idal: code cleanupHeiko Carstens
Adjust coding style, partially refactor code, and use kcalloc() instead of kmalloc() to allocate an idaw array. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/dasd: use new address translation helpersHeiko Carstens
Use virt_to_dma32() and friends to properly convert virtual to physical and physical to virtual addresses so that "make C=1" does not generate any warnings anymore. Reviewed-by: Stefan Haberland <sth@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/dasd: remove superfluous virt_to_phys() conversionHeiko Carstens
Only the last 12 bits of virtual / physical addresses are used when masking with IDA_BLOCK_SIZE - 1. Given that the bits are the same regardless of virtual or physical address, remove the virtual to physical address conversion. Reviewed-by: Stefan Haberland <sth@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/virtio_ccw: avoid converting dma addresses / handlesHalil Pasic
Instead of converting virtual to physical addresses with the virt_to_dma*() functions, use dma addresses as provided by DMA API and only add offsets to these addresses. This makes sure that address conversion is only done by the DMA API. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/virtio_ccw: use DMA handle from DMA APIHalil Pasic
Change and use ccw_device_dma_zalloc() so it returns a virtual address like before, which can be used to access data. However also pass a new dma32_t pointer type handle, which correlates to the returned virtual address. This pointer is used to directly pass/set the DMA handle as returned by the DMA API. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/virtio_ccw: fix virtual vs physical address confusionHalil Pasic
Fix virtual vs physical address confusion and use new dma types and helper functions to allow for type checking. This does not fix a bug since virtual and physical address spaces are currently the same. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/cio: use bitwise types to allow for type checkingHeiko Carstens
Change types of I/O structure members which contain physical addresses to dma32_t and dma64_t bitwise types. This allows to make use of sparse (aka "make C=1") to find incorrect usage of physical addresses. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/cio: introduce bitwise dma types and helper functionsHalil Pasic
Introduce dma32_t and dma64_t bitwise types, which are supposed to be used for 31 and 64 bit DMA capable addresses. This allows to use sparse (make C=1) for type checking, so that incorrect usages can be easily found. Also add a couple of helper functions which - convert virtual to DMA addresses and vice versa - allow for simple logical and arithmetic operations on DMA addresses - convert DMA addresses to plain u32 and u64 values All helper functions exist to avoid excessive casting in C code. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Co-developed-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/vfio_ccw: fix virtual vs physical address confusionHeiko Carstens
Fix virtual vs physical address confusion. This does not fix a bug since virtual and physical address spaces are currently the same. Reviewed-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/cio: fix virtual vs physical address confusionHeiko Carstens
Fix virtual vs physical address confusion. This does not fix a bug since virtual and physical address spaces are currently the same. Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/dasd_eckd: fix virtual vs physical address confusionHeiko Carstens
Fix virtual vs physical address confusion. This does not fix a bug since virtual and physical address spaces are currently the same. Reviewed-by: Stefan Haberland <sth@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/dcssblk: fix virtual vs physical address confusionGerald Schaefer
Fix virtual vs physical address confusion. This does not fix a bug since virtual and physical address spaces are currently the same. dax_direct_access() should receive a virtual kernel address in kaddr. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/iucv: fix receive buffer virtual vs physical address confusionAlexander Gordeev
Fix IUCV_IPBUFLST-type buffers virtual vs physical address confusion. This does not fix a bug since virtual and physical address spaces are currently the same. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/zcrypt: fix reference counting on zcrypt card objectsHarald Freudenberger
Tests with hot-plugging crytpo cards on KVM guests with debug kernel build revealed an use after free for the load field of the struct zcrypt_card. The reason was an incorrect reference handling of the zcrypt card object which could lead to a free of the zcrypt card object while it was still in use. This is an example of the slab message: kernel: 0x00000000885a7512-0x00000000885a7513 @offset=1298. First byte 0x68 instead of 0x6b kernel: Allocated in zcrypt_card_alloc+0x36/0x70 [zcrypt] age=18046 cpu=3 pid=43 kernel: kmalloc_trace+0x3f2/0x470 kernel: zcrypt_card_alloc+0x36/0x70 [zcrypt] kernel: zcrypt_cex4_card_probe+0x26/0x380 [zcrypt_cex4] kernel: ap_device_probe+0x15c/0x290 kernel: really_probe+0xd2/0x468 kernel: driver_probe_device+0x40/0xf0 kernel: __device_attach_driver+0xc0/0x140 kernel: bus_for_each_drv+0x8c/0xd0 kernel: __device_attach+0x114/0x198 kernel: bus_probe_device+0xb4/0xc8 kernel: device_add+0x4d2/0x6e0 kernel: ap_scan_adapter+0x3d0/0x7c0 kernel: ap_scan_bus+0x5a/0x3b0 kernel: ap_scan_bus_wq_callback+0x40/0x60 kernel: process_one_work+0x26e/0x620 kernel: worker_thread+0x21c/0x440 kernel: Freed in zcrypt_card_put+0x54/0x80 [zcrypt] age=9024 cpu=3 pid=43 kernel: kfree+0x37e/0x418 kernel: zcrypt_card_put+0x54/0x80 [zcrypt] kernel: ap_device_remove+0x4c/0xe0 kernel: device_release_driver_internal+0x1c4/0x270 kernel: bus_remove_device+0x100/0x188 kernel: device_del+0x164/0x3c0 kernel: device_unregister+0x30/0x90 kernel: ap_scan_adapter+0xc8/0x7c0 kernel: ap_scan_bus+0x5a/0x3b0 kernel: ap_scan_bus_wq_callback+0x40/0x60 kernel: process_one_work+0x26e/0x620 kernel: worker_thread+0x21c/0x440 kernel: kthread+0x150/0x168 kernel: __ret_from_fork+0x3c/0x58 kernel: ret_from_fork+0xa/0x30 kernel: Slab 0x00000372022169c0 objects=20 used=18 fp=0x00000000885a7c88 flags=0x3ffff00000000a00(workingset|slab|node=0|zone=1|lastcpupid=0x1ffff) kernel: Object 0x00000000885a74b8 @offset=1208 fp=0x00000000885a7c88 kernel: Redzone 00000000885a74b0: bb bb bb bb bb bb bb bb ........ kernel: Object 00000000885a74b8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk kernel: Object 00000000885a74c8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk kernel: Object 00000000885a74d8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk kernel: Object 00000000885a74e8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk kernel: Object 00000000885a74f8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk kernel: Object 00000000885a7508: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 68 4b 6b 6b 6b a5 kkkkkkkkkkhKkkk. kernel: Redzone 00000000885a7518: bb bb bb bb bb bb bb bb ........ kernel: Padding 00000000885a756c: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZ kernel: CPU: 0 PID: 387 Comm: systemd-udevd Not tainted 6.8.0-HF #2 kernel: Hardware name: IBM 3931 A01 704 (KVM/Linux) kernel: Call Trace: kernel: [<00000000ca5ab5b8>] dump_stack_lvl+0x90/0x120 kernel: [<00000000c99d78bc>] check_bytes_and_report+0x114/0x140 kernel: [<00000000c99d53cc>] check_object+0x334/0x3f8 kernel: [<00000000c99d820c>] alloc_debug_processing+0xc4/0x1f8 kernel: [<00000000c99d852e>] get_partial_node.part.0+0x1ee/0x3e0 kernel: [<00000000c99d94ec>] ___slab_alloc+0xaf4/0x13c8 kernel: [<00000000c99d9e38>] __slab_alloc.constprop.0+0x78/0xb8 kernel: [<00000000c99dc8dc>] __kmalloc+0x434/0x590 kernel: [<00000000c9b4c0ce>] ext4_htree_store_dirent+0x4e/0x1c0 kernel: [<00000000c9b908a2>] htree_dirblock_to_tree+0x17a/0x3f0 kernel: [<00000000c9b919dc>] ext4_htree_fill_tree+0x134/0x400 kernel: [<00000000c9b4b3d0>] ext4_dx_readdir+0x160/0x2f0 kernel: [<00000000c9b4bedc>] ext4_readdir+0x5f4/0x760 kernel: [<00000000c9a7efc4>] iterate_dir+0xb4/0x280 kernel: [<00000000c9a7f1ea>] __do_sys_getdents64+0x5a/0x120 kernel: [<00000000ca5d6946>] __do_syscall+0x256/0x310 kernel: [<00000000ca5eea10>] system_call+0x70/0x98 kernel: INFO: lockdep is turned off. kernel: FIX kmalloc-96: Restoring Poison 0x00000000885a7512-0x00000000885a7513=0x6b kernel: FIX kmalloc-96: Marking all objects used The fix is simple: Before use of the queue not only the queue object but also the card object needs to increase it's reference count with a call to zcrypt_card_get(). Similar after use of the queue not only the queue but also the card object's reference count is decreased with zcrypt_card_put(). Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/vtime: fix average steal time calculationMete Durlu
Current average steal timer calculation produces volatile and inflated values. The only user of this value is KVM so far and it uses that to decide whether or not to yield the vCPU which is seeing steal time. KVM compares average steal timer to a threshold and if the threshold is past then it does not allow CPU polling and yields it to host, else it keeps the CPU by polling. Since KVM's steal time threshold is very low by default (%10) it most likely is not effected much by the bloated average steal timer values because the operating region is pretty small. However there might be new users in the future who might rely on this number. Fix average steal timer calculation by changing the formula from: avg_steal_timer = avg_steal_timer / 2 + steal_timer; to the following: avg_steal_timer = (avg_steal_timer + steal_timer) / 2; This ensures that avg_steal_timer is actually a naive average of steal timer values. It now closely follows steal timer values but of course in a smoother manner. Fixes: 152e9b8676c6 ("s390/vtime: steal time exponential moving average") Signed-off-by: Mete Durlu <meted@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13s390/sysinfo: allow response buffer in normal memoryAlexander Gordeev
As provided with commit cd4386a931b63 ("s390/cpcmd,vmcp: avoid GFP_DMA allocations") the Diagnose Code 8 response buffer does not have to be below 2GB. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-13octeontx2-af: Use matching wake_up API variant in CGX command interfaceLinu Cherian
Use wake_up API instead of wake_up_interruptible, since wait_event_timeout API is used for waiting on command completion. Fixes: 1463f382f58d ("octeontx2-af: Add support for CGX link management") Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-13soc: fsl: qbman: Use raw spinlock for cgr_lockSean Anderson
smp_call_function always runs its callback in hard IRQ context, even on PREEMPT_RT, where spinlocks can sleep. So we need to use a raw spinlock for cgr_lock to ensure we aren't waiting on a sleeping task. Although this bug has existed for a while, it was not apparent until commit ef2a8d5478b9 ("net: dpaa: Adjust queue depth on rate change") which invokes smp_call_function_single via qman_update_cgr_safe every time a link goes up or down. Fixes: 96f413f47677 ("soc/fsl/qbman: fix issue in qman_delete_cgr_safe()") CC: stable@vger.kernel.org Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com> Closes: https://lore.kernel.org/all/20230323153935.nofnjucqjqnz34ej@skbuf/ Reported-by: Steffen Trumtrar <s.trumtrar@pengutronix.de> Closes: https://lore.kernel.org/linux-arm-kernel/87wmsyvclu.fsf@pengutronix.de/ Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Camelia Groza <camelia.groza@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-13soc: fsl: qbman: Always disable interrupts when taking cgr_lockSean Anderson
smp_call_function_single disables IRQs when executing the callback. To prevent deadlocks, we must disable IRQs when taking cgr_lock elsewhere. This is already done by qman_update_cgr and qman_delete_cgr; fix the other lockers. Fixes: 96f413f47677 ("soc/fsl/qbman: fix issue in qman_delete_cgr_safe()") CC: stable@vger.kernel.org Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Camelia Groza <camelia.groza@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-13ALSA: hda/tas2781: remove unnecessary runtime_pm callsPierre-Louis Bossart
The runtime_pm handling seems to have been loosely inspired by the cs32l41 driver, but in this case the get_noresume/put sequence is not required. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Message-ID: <20240312161217.79510-1-pierre-louis.bossart@linux.intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-03-13ALSA: hda/realtek - ALC236 fix volume mute & mic mute LED on some HP modelsValentine Altair
Some HP laptops have received revisions that altered their board IDs and therefore the current patches/quirks do not apply to them. Specifically, for my Probook 440 G8, I have a board ID of 8a74. It is necessary to add a line for that specific model. Signed-off-by: Valentine Altair <faetalize@proton.me> Cc: <stable@vger.kernel.org> Message-ID: <kOqXRBcxkKt6m5kciSDCkGqMORZi_HB3ZVPTX5sD3W1pKxt83Pf-WiQ1V1pgKKI8pYr4oGvsujt3vk2zsCE-DDtnUADFG6NGBlS5N3U4xgA=@proton.me> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-03-13Merge branch 'for-6.9/cxl-fixes' into for-6.9/cxlDan Williams
Pick up a parsing fix for the CDAT SSLBIS structure for v6.9.
2024-03-13Merge branch 'for-6.9/cxl-einj' into for-6.9/cxlDan Williams
Pick up support for injecting errors via ACPI EINJ into the CXL protocol for v6.9.
2024-03-13Merge branch 'for-6.9/cxl-qos' into for-6.9/cxlDan Williams
Pick up support for CXL "HMEM reporting" for v6.9, i.e. build an HMAT from CXL CDAT and PCIe switch information.
2024-03-13lib/firmware_table: Provide buffer length argument to cdat_table_parse()Robert Richter
There exist card implementations with a CDAT table using a fixed size buffer, but with entries filled in that do not fill the whole table length size. Then, the last entry in the CDAT table may not mark the end of the CDAT table buffer specified by the length field in the CDAT header. It can be shorter with trailing unused (zero'ed) data. The actual table length is determined while reading all CDAT entries of the table with DOE. If the table is greater than expected (containing zero'ed trailing data), the CDAT parser fails with: [ 48.691717] Malformed DSMAS table length: (24:0) [ 48.702084] [CDAT:0x00] Invalid zero length [ 48.711460] cxl_port endpoint1: Failed to parse CDAT: -22 In addition, a check of the table buffer length is missing to prevent an out-of-bound access then parsing the CDAT table. Hardening code against device returning borked table. Fix that by providing an optional buffer length argument to acpi_parse_entries_array() that can be used by cdat_table_parse() to propagate the buffer size down to its users to check the buffer length. This also prevents a possible out-of-bound access mentioned. Add a check to warn about a malformed CDAT table length. Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Len Brown <lenb@kernel.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/ZdEnopFO0Tl3t2O1@rric.localdomain Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12cxl/pci: Get rid of pointer arithmetic reading CDAT tableRobert Richter
Reading the CDAT table using DOE requires a Table Access Response Header in addition to the CDAT entry. In current implementation this has caused offsets with sizeof(__le32) to the actual buffers. This led to hardly readable code and even bugs. E.g., see fix of devm_kfree() in read_cdat_data(): commit c65efe3685f5 ("cxl/cdat: Free correct buffer on checksum error") Rework code to avoid calculations with sizeof(__le32). Introduce struct cdat_doe_rsp for this which contains the Table Access Response Header and a variable payload size for various data structures afterwards to access the CDAT table and its CDAT Data Structures without recalculating buffer offsets. Cc: Lukas Wunner <lukas@wunner.de> Cc: Fan Ni <nifan.cxl@gmail.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240216155844.406996-3-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12cxl/pci: Rename DOE mailbox handle to doe_mbRobert Richter
Trivial variable rename for the DOE mailbox handle from cdat_doe to doe_mb. The variable name cdat_doe is too ambiguous, use doe_mb that is commonly used for the mailbox. Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240216155844.406996-2-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12cxl: Fix the incorrect assignment of SSLBIS entry pointer initial locationDave Jiang
The 'entry' pointer in cdat_sslbis_handler() is set to header + sizeof(common header). However, the math missed the addition of the SSLBIS main header. It should be header + sizeof(common header) + sizeof(*sslbis). Use a defined struct for all the SSLBIS parts in order to avoid pointer math errors. The bug causes incorrect parsing of the SSLBIS table and introduces incorrect performance values to the access_coordinates during the CXL access_coordinate calculation path if there are CXL switches present in the topology. The issue was found during testing of new code being added to add additional checks for invalid CDAT values during CXL access_coordinate calculation. The testing was done on qemu with a CXL topology including a CXL switch. Fixes: 80aa780dda20 ("cxl: Add callback to parse the SSLBIS subtable from CDAT") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/20240301210948.1298075-1-dave.jiang@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12cxl/core: Add CXL EINJ debugfs filesBen Cheatham
Export CXL helper functions in einj-cxl.c for getting/injecting available CXL protocol error types to sysfs under kernel/debug/cxl. The kernel/debug/cxl/einj_types file will print the available CXL protocol errors in the same format as the available_error_types file provided by the einj module. The kernel/debug/cxl/$dport_dev/einj_inject file is functionally the same as the error_type and error_inject files provided by the EINJ module, i.e.: writing an error type into $dport_dev/einj_inject will inject said error type into the CXL dport represented by $dport_dev. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com> Link: https://lore.kernel.org/r/20240311142508.31717-4-Benjamin.Cheatham@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12EINJ, Documentation: Update EINJ kernel docBen Cheatham
Update EINJ kernel document to include how to inject CXL protocol error types, build the kernel to include CXL error types, and give an example injection. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com> Link: https://lore.kernel.org/r/20240311142508.31717-5-Benjamin.Cheatham@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12EINJ: Add CXL error type supportBen Cheatham
Move CXL protocol error types from einj.c (now einj-core.c) to einj-cxl.c. einj-cxl.c implements the necessary handling for CXL protocol error injection and exposes an API for the CXL core to use said functionality, while also allowing the EINJ module to be built without CXL support. Because CXL error types targeting CXL 1.0/1.1 ports require special handling, only allow them to be injected through the new cxl debugfs interface (next commit) and return an error when attempting to inject through the legacy interface. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com> Link: https://lore.kernel.org/r/20240311142508.31717-3-Benjamin.Cheatham@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12EINJ: Migrate to a platform driverBen Cheatham
Change the EINJ module to install a platform device/driver on module init and move the module init() and exit() functions to driver probe and remove. This change allows the EINJ module to load regardless of whether setting up EINJ succeeds, which allows dependent modules to still load (i.e. the CXL core). Since EINJ may no longer be initialized when the module loads, any functions that are called from dependent/external modules should safegaurd against the case EINJ didn't load. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@amd.com> Link: https://lore.kernel.org/r/20240311142508.31717-2-Benjamin.Cheatham@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-03-12Merge tag 'printk-for-6.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: "Improve the behavior during panic. The issues were found when testing the ongoing changes introducing atomic consoles and printk kthreads: - pr_flush() has to wait for the last reserved record instead of the last finalized one. Note that records are finalized in random order when generated by more CPUs in parallel. - Ignore non-finalized records during panic(). Messages printed on panic-CPU are always finalized. Messages printed by other CPUs might never be finalized when the CPUs get stopped. - Block new printk() calls on non-panic CPUs completely. Backtraces are printed before entering the panic mode. Later messages would just mess information printed by the panic CPU. - Do not take console_lock in console_flush_on_panic() at all. The original code did try_lock()/console_unlock(). The unlock part might cause a deadlock when panic() happened in a scheduler code. - Fix conversion of 64-bit sequence number for 32-bit atomic operations" * tag 'printk-for-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: dump_stack: Do not get cpu_sync for panic CPU panic: Flush kernel log buffer at the end printk: Avoid non-panic CPUs writing to ringbuffer printk: Disable passing console lock owner completely during panic() printk: ringbuffer: Skip non-finalized records in panic printk: Wait for all reserved records with pr_flush() printk: ringbuffer: Cleanup reader terminology printk: Add this_cpu_in_panic() printk: For @suppress_panic_printk check for other CPU in panic printk: ringbuffer: Clarify special lpos values printk: ringbuffer: Do not skip non-finalized records with prb_next_seq() printk: Use prb_first_seq() as base for 32bit seq macros printk: Adjust mapping for 32bit seq macros printk: nbcon: Relocate 32bit seq macros
2024-03-12mm, slab: remove last vestiges of SLAB_MEM_SPREADLinus Torvalds
Yes, yes, I know the slab people were planning on going slow and letting every subsystem fight this thing on their own. But let's just rip off the band-aid and get it over and done with. I don't want to see a number of unnecessary pull requests just to get rid of a flag that no longer has any meaning. This was mainly done with a couple of 'sed' scripts and then some manual cleanup of the end result. Link: https://lore.kernel.org/all/CAHk-=wji0u+OOtmAOD-5JV3SXcRJF___k_+8XNKmak0yd5vW1Q@mail.gmail.com/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-03-12Merge tag 'slab-for-6.9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab updates from Vlastimil Babka: - Freelist loading optimization (Chengming Zhou) When the per-cpu slab is depleted and a new one loaded from the cpu partial list, optimize the loading to avoid an irq enable/disable cycle. This results in a 3.5% performance improvement on the "perf bench sched messaging" test. - Kernel boot parameters cleanup after SLAB removal (Xiongwei Song) Due to two different main slab implementations we've had boot parameters prefixed either slab_ and slub_ with some later becoming an alias as both implementations gained the same functionality (i.e. slab_nomerge vs slub_nomerge). In order to eventually get rid of the implementation-specific names, the canonical and documented parameters are now all prefixed slab_ and the slub_ variants become deprecated but still working aliases. - SLAB_ kmem_cache creation flags cleanup (Vlastimil Babka) The flags had hardcoded #define values which became tedious and error-prone when adding new ones. Assign the values via an enum that takes care of providing unique bit numbers. Also deprecate SLAB_MEM_SPREAD which was only used by SLAB, so it's a no-op since SLAB removal. Assign it an explicit zero value. The removals of the flag usage are handled independently in the respective subsystems, with a final removal of any leftover usage planned for the next release. - Misc cleanups and fixes (Chengming Zhou, Xiaolei Wang, Zheng Yejian) Includes removal of unused code or function parameters and a fix of a memleak. * tag 'slab-for-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: slab: remove PARTIAL_NODE slab_state mm, slab: remove memcg_from_slab_obj() mm, slab: remove the corner case of inc_slabs_node() mm/slab: Fix a kmemleak in kmem_cache_destroy() mm, slab, kasan: replace kasan_never_merge() with SLAB_NO_MERGE mm, slab: use an enum to define SLAB_ cache creation flags mm, slab: deprecate SLAB_MEM_SPREAD flag mm, slab: fix the comment of cpu partial list mm, slab: remove unused object_size parameter in kmem_cache_flags() mm/slub: remove parameter 'flags' in create_kmalloc_caches() mm/slub: remove unused parameter in next_freelist_entry() mm/slub: remove full list manipulation for non-debug slab mm/slub: directly load freelist from cpu partial slab in the likely case mm/slub: make the description of slab_min_objects helpful in doc mm/slub: replace slub_$params with slab_$params in slub.rst mm/slub: unify all sl[au]b parameters with "slab_$param" Documentation: kernel-parameters: remove noaliencache
2024-03-12Merge tag 'lsm-pr-20240312' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm updates from Paul Moore: - Promote IMA/EVM to a proper LSM This is the bulk of the diffstat, and the source of all the changes in the VFS code. Prior to the start of the LSM stacking work it was important that IMA/EVM were separate from the rest of the LSMs, complete with their own hooks, infrastructure, etc. as it was the only way to enable IMA/EVM at the same time as a LSM. However, now that the bulk of the LSM infrastructure supports multiple simultaneous LSMs, we can simplify things greatly by bringing IMA/EVM into the LSM infrastructure as proper LSMs. This is something I've wanted to see happen for quite some time and Roberto was kind enough to put in the work to make it happen. - Use the LSM hook default values to simplify the call_int_hook() macro Previously the call_int_hook() macro required callers to supply a default return value, despite a default value being specified when the LSM hook was defined. This simplifies the macro by using the defined default return value which makes life easier for callers and should also reduce the number of return value bugs in the future (we've had a few pop up recently, hence this work). - Use the KMEM_CACHE() macro instead of kmem_cache_create() The guidance appears to be to use the KMEM_CACHE() macro when possible and there is no reason why we can't use the macro, so let's use it. - Fix a number of comment typos in the LSM hook comment blocks Not much to say here, we fixed some questionable grammar decisions in the LSM hook comment blocks. * tag 'lsm-pr-20240312' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (28 commits) cred: Use KMEM_CACHE() instead of kmem_cache_create() lsm: use default hook return value in call_int_hook() lsm: fix typos in security/security.c comment headers integrity: Remove LSM ima: Make it independent from 'integrity' LSM evm: Make it independent from 'integrity' LSM evm: Move to LSM infrastructure ima: Move IMA-Appraisal to LSM infrastructure ima: Move to LSM infrastructure integrity: Move integrity_kernel_module_request() to IMA security: Introduce key_post_create_or_update hook security: Introduce inode_post_remove_acl hook security: Introduce inode_post_set_acl hook security: Introduce inode_post_create_tmpfile hook security: Introduce path_post_mknod hook security: Introduce file_release hook security: Introduce file_post_open hook security: Introduce inode_post_removexattr hook security: Introduce inode_post_setattr hook security: Align inode_setattr hook definition with EVM ...
2024-03-12Merge tag 'selinux-pr-20240312' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: "Really only a few notable changes: - Continue the coding style/formatting fixup work This is the bulk of the diffstat in this pull request, with the focus this time around being the security/selinux/ss directory. We've only got a couple of files left to cleanup and once we're done with that we can start enabling some automatic style verfication and introduce tooling to help new folks format their code correctly. - Don't restrict xattr copy-up when SELinux policy is not loaded This helps systems that use overlayfs, or similar filesystems, preserve their SELinux labels during early boot when the SELinux policy has yet to be loaded. - Reduce the work we do during inode initialization time This isn't likely to show up in any benchmark results, but we removed an unnecessary SELinux object class lookup/calculation during inode initialization. - Correct the return values in selinux_socket_getpeersec_dgram() We had some inconsistencies with respect to our return values across selinux_socket_getpeersec_dgram() and selinux_socket_getpeersec_stream(). This provides a more uniform set of error codes across the two functions and should help make it easier for users to identify the source of a failure" * tag 'selinux-pr-20240312' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: (24 commits) selinux: fix style issues in security/selinux/ss/symtab.c selinux: fix style issues in security/selinux/ss/symtab.h selinux: fix style issues in security/selinux/ss/sidtab.c selinux: fix style issues in security/selinux/ss/sidtab.h selinux: fix style issues in security/selinux/ss/services.h selinux: fix style issues in security/selinux/ss/policydb.c selinux: fix style issues in security/selinux/ss/policydb.h selinux: fix style issues in security/selinux/ss/mls_types.h selinux: fix style issues in security/selinux/ss/mls.c selinux: fix style issues in security/selinux/ss/mls.h selinux: fix style issues in security/selinux/ss/hashtab.c selinux: fix style issues in security/selinux/ss/hashtab.h selinux: fix style issues in security/selinux/ss/ebitmap.c selinux: fix style issues in security/selinux/ss/ebitmap.h selinux: fix style issues in security/selinux/ss/context.h selinux: fix style issues in security/selinux/ss/context.h selinux: fix style issues in security/selinux/ss/constraint.h selinux: fix style issues in security/selinux/ss/conditional.c selinux: fix style issues in security/selinux/ss/conditional.h selinux: fix style issues in security/selinux/ss/avtab.c ...
2024-03-12Merge branch 'tcp-rds-fix-use-after-free-around-kernel-tcp-reqsk'Jakub Kicinski
Kuniyuki Iwashima says: ==================== tcp/rds: Fix use-after-free around kernel TCP reqsk. syzkaller reported an warning of netns ref tracker for RDS TCP listener, which commit 740ea3c4a0b2 ("tcp: Clean up kernel listener's reqsk in inet_twsk_purge()") fixed for per-netns ehash. This series fixes the bug in the partial fix and fixes the reported bug in the global ehash. v4: https://lore.kernel.org/netdev/20240307232151.55963-1-kuniyu@amazon.com/ v3: https://lore.kernel.org/netdev/20240307224423.53315-1-kuniyu@amazon.com/ v2: https://lore.kernel.org/netdev/20240227011041.97375-1-kuniyu@amazon.com/ v1: https://lore.kernel.org/netdev/20240223172448.94084-1-kuniyu@amazon.com/ ==================== Link: https://lore.kernel.org/r/20240308200122.64357-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-12rds: tcp: Fix use-after-free of net in reqsk_timer_handler().Kuniyuki Iwashima
syzkaller reported a warning of netns tracker [0] followed by KASAN splat [1] and another ref tracker warning [1]. syzkaller could not find a repro, but in the log, the only suspicious sequence was as follows: 18:26:22 executing program 1: r0 = socket$inet6_mptcp(0xa, 0x1, 0x106) ... connect$inet6(r0, &(0x7f0000000080)={0xa, 0x4001, 0x0, @loopback}, 0x1c) (async) The notable thing here is 0x4001 in connect(), which is RDS_TCP_PORT. So, the scenario would be: 1. unshare(CLONE_NEWNET) creates a per netns tcp listener in rds_tcp_listen_init(). 2. syz-executor connect()s to it and creates a reqsk. 3. syz-executor exit()s immediately. 4. netns is dismantled. [0] 5. reqsk timer is fired, and UAF happens while freeing reqsk. [1] 6. listener is freed after RCU grace period. [2] Basically, reqsk assumes that the listener guarantees netns safety until all reqsk timers are expired by holding the listener's refcount. However, this was not the case for kernel sockets. Commit 740ea3c4a0b2 ("tcp: Clean up kernel listener's reqsk in inet_twsk_purge()") fixed this issue only for per-netns ehash. Let's apply the same fix for the global ehash. [0]: ref_tracker: net notrefcnt@0000000065449cc3 has 1/1 users at sk_alloc (./include/net/net_namespace.h:337 net/core/sock.c:2146) inet6_create (net/ipv6/af_inet6.c:192 net/ipv6/af_inet6.c:119) __sock_create (net/socket.c:1572) rds_tcp_listen_init (net/rds/tcp_listen.c:279) rds_tcp_init_net (net/rds/tcp.c:577) ops_init (net/core/net_namespace.c:137) setup_net (net/core/net_namespace.c:340) copy_net_ns (net/core/net_namespace.c:497) create_new_namespaces (kernel/nsproxy.c:110) unshare_nsproxy_namespaces (kernel/nsproxy.c:228 (discriminator 4)) ksys_unshare (kernel/fork.c:3429) __x64_sys_unshare (kernel/fork.c:3496) do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129) ... WARNING: CPU: 0 PID: 27 at lib/ref_tracker.c:179 ref_tracker_dir_exit (lib/ref_tracker.c:179) [1]: BUG: KASAN: slab-use-after-free in inet_csk_reqsk_queue_drop (./include/net/inet_hashtables.h:180 net/ipv4/inet_connection_sock.c:952 net/ipv4/inet_connection_sock.c:966) Read of size 8 at addr ffff88801b370400 by task swapper/0/0 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Call Trace: <IRQ> dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1)) print_report (mm/kasan/report.c:378 mm/kasan/report.c:488) kasan_report (mm/kasan/report.c:603) inet_csk_reqsk_queue_drop (./include/net/inet_hashtables.h:180 net/ipv4/inet_connection_sock.c:952 net/ipv4/inet_connection_sock.c:966) reqsk_timer_handler (net/ipv4/inet_connection_sock.c:979 net/ipv4/inet_connection_sock.c:1092) call_timer_fn (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/timer.h:127 kernel/time/timer.c:1701) __run_timers.part.0 (kernel/time/timer.c:1752 kernel/time/timer.c:2038) run_timer_softirq (kernel/time/timer.c:2053) __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:554) irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632 kernel/softirq.c:644) sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1076 (discriminator 14)) </IRQ> Allocated by task 258 on cpu 0 at 83.612050s: kasan_save_stack (mm/kasan/common.c:48) kasan_save_track (mm/kasan/common.c:68) __kasan_slab_alloc (mm/kasan/common.c:343) kmem_cache_alloc (mm/slub.c:3813 mm/slub.c:3860 mm/slub.c:3867) copy_net_ns (./include/linux/slab.h:701 net/core/net_namespace.c:421 net/core/net_namespace.c:480) create_new_namespaces (kernel/nsproxy.c:110) unshare_nsproxy_namespaces (kernel/nsproxy.c:228 (discriminator 4)) ksys_unshare (kernel/fork.c:3429) __x64_sys_unshare (kernel/fork.c:3496) do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129) Freed by task 27 on cpu 0 at 329.158864s: kasan_save_stack (mm/kasan/common.c:48) kasan_save_track (mm/kasan/common.c:68) kasan_save_free_info (mm/kasan/generic.c:643) __kasan_slab_free (mm/kasan/common.c:265) kmem_cache_free (mm/slub.c:4299 mm/slub.c:4363) cleanup_net (net/core/net_namespace.c:456 net/core/net_namespace.c:446 net/core/net_namespace.c:639) process_one_work (kernel/workqueue.c:2638) worker_thread (kernel/workqueue.c:2700 kernel/workqueue.c:2787) kthread (kernel/kthread.c:388) ret_from_fork (arch/x86/kernel/process.c:153) ret_from_fork_asm (arch/x86/entry/entry_64.S:250) The buggy address belongs to the object at ffff88801b370000 which belongs to the cache net_namespace of size 4352 The buggy address is located 1024 bytes inside of freed 4352-byte region [ffff88801b370000, ffff88801b371100) [2]: WARNING: CPU: 0 PID: 95 at lib/ref_tracker.c:228 ref_tracker_free (lib/ref_tracker.c:228 (discriminator 1)) Modules linked in: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:ref_tracker_free (lib/ref_tracker.c:228 (discriminator 1)) ... Call Trace: <IRQ> __sk_destruct (./include/net/net_namespace.h:353 net/core/sock.c:2204) rcu_core (./arch/x86/include/asm/preempt.h:26 kernel/rcu/tree.c:2165 kernel/rcu/tree.c:2433) __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:554) irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632 kernel/softirq.c:644) sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1076 (discriminator 14)) </IRQ> Reported-by: syzkaller <syzkaller@googlegroups.com> Suggested-by: Eric Dumazet <edumazet@google.com> Fixes: 467fa15356ac ("RDS-TCP: Support multiple RDS-TCP listen endpoints, one per netns.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240308200122.64357-3-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-12tcp: Fix NEW_SYN_RECV handling in inet_twsk_purge()Eric Dumazet
inet_twsk_purge() uses rcu to find TIME_WAIT and NEW_SYN_RECV objects to purge. These objects use SLAB_TYPESAFE_BY_RCU semantic and need special care. We need to use refcount_inc_not_zero(&sk->sk_refcnt). Reuse the existing correct logic I wrote for TIME_WAIT, because both structures have common locations for sk_state, sk_family, and netns pointer. If after the refcount_inc_not_zero() the object fields longer match the keys, use sock_gen_put(sk) to release the refcount. Then we can call inet_twsk_deschedule_put() for TIME_WAIT, inet_csk_reqsk_queue_drop_and_put() for NEW_SYN_RECV sockets, with BH disabled. Then we need to restart the loop because we had drop rcu_read_lock(). Fixes: 740ea3c4a0b2 ("tcp: Clean up kernel listener's reqsk in inet_twsk_purge()") Link: https://lore.kernel.org/netdev/CANn89iLvFuuihCtt9PME2uS1WJATnf5fKjDToa1WzVnRzHnPfg@mail.gmail.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240308200122.64357-2-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-13Revert "crypto: remove CONFIG_CRYPTO_STATS"Herbert Xu
This reverts commit 2beb81fbf0c01a62515a1bcef326168494ee2bd0. While removing CONFIG_CRYPTO_STATS is a worthy goal, this also removed unrelated infrastructure such as crypto_comp_alg_common. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2024-03-12f2fs: prevent atomic write on pinned fileDaeho Jeong
Since atomic write way was changed to out-place-update, we should prevent it on pinned files. Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-12f2fs: fix to handle error paths of {new,change}_curseg()Zhiguo Niu
{new,change}_curseg() may return error in some special cases, error handling should be did in their callers, and this will also facilitate subsequent error path expansion in {new,change}_curseg(). Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-12f2fs: unify the error handling of f2fs_is_valid_blkaddrZhiguo Niu
There are some cases of f2fs_is_valid_blkaddr not handled as ERROR_INVALID_BLKADDR,so unify the error handling about all of f2fs_is_valid_blkaddr. Do f2fs_handle_error in __f2fs_is_valid_blkaddr for cleanup. Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-12f2fs: zone: fix to remove pow2 check condition for zoned block deviceChao Yu
Commit 2e2c6e9b72ce ("f2fs: remove power-of-two limitation of zoned device") missed to remove pow2 check condition in init_blkz_info(), fix it. Fixes: 2e2c6e9b72ce ("f2fs: remove power-of-two limitation of zoned device") Signed-off-by: Feng Song <songfeng@oppo.com> Signed-off-by: Yongpeng Yang <yangyongpeng1@oppo.com> Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-12f2fs: fix to truncate meta inode pages forcelyChao Yu
Below race case can cause data corruption: Thread A GC thread - gc_data_segment - ra_data_block - locked meta_inode page - f2fs_inplace_write_data - invalidate_mapping_pages : fail to invalidate meta_inode page due to lock failure or dirty|writeback status - f2fs_submit_page_bio : write last dirty data to old blkaddr - move_data_block - load old data from meta_inode page - f2fs_submit_page_write : write old data to new blkaddr Because invalidate_mapping_pages() will skip invalidating page which has unclear status including locked, dirty, writeback and so on, so we need to use truncate_inode_pages_range() instead of invalidate_mapping_pages() to make sure meta_inode page will be dropped. Fixes: 6aa58d8ad20a ("f2fs: readahead encrypted block during GC") Fixes: e3b49ea36802 ("f2fs: invalidate META_MAPPING before IPU/DIO write") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>