summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-09-03drm/i915/display: Increase Fast Wake Sync length as a quirkJouni Högander
In commit "drm/i915/display: Increase number of fast wake precharge pulses" we were increasing Fast Wake sync pulse length to fix problems observed on Dell Precision 5490 laptop with AUO panel. Later we have observed this is causing problems on other panels. Fix these problems by increasing Fast Wake sync pulse length as a quirk applied for Dell Precision 5490 with problematic panel. Fixes: f77772866385 ("drm/i915/display: Increase number of fast wake precharge pulses") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Closes: http://gitlab.freedesktop.org/drm/i915/kernel/-/issues/9739 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2246 Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11762 Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Link: https://patchwork.freedesktop.org/patch/msgid/20240902064241.1020965-3-jouni.hogander@intel.com (cherry picked from commit fcba2ed66b39252210f4e739722ebcc5398c2197) Requires: 43cf50eb1408 ("drm/i915/display: Add mechanism to use sink model when applying quirk") Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-09-03drm/i915/display: Add mechanism to use sink model when applying quirkJouni Högander
Currently there is no way to apply quirk on device only if certain panel model is installed. This patch implements such mechanism by adding new quirk type intel_dpcd_quirk which contains also sink_oui and sink_device_id fields and using also them to figure out if applying quirk is needed. New intel_init_dpcd_quirks is added and called after drm_dp_read_desc with proper sink device identity read from dpcdc. v3: - !mem_is_zero fixed to mem_is_zero v2: - instead of using struct intel_quirk add new struct intel_dpcd_quirk Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240902064241.1020965-2-jouni.hogander@intel.com (cherry picked from commit b3b91369908ac63be6f64905448b8ba5cd151875) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-09-02Merge tag 'for-net-2024-08-30' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - qca: If memdump doesn't work, re-enable IBS - MGMT: Fix not generating command complete for MGMT_OP_DISCONNECT - Revert "Bluetooth: MGMT/SMP: Fix address type when using SMP over BREDR/LE" - MGMT: Ignore keys being loaded with invalid type * tag 'for-net-2024-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: MGMT: Ignore keys being loaded with invalid type Revert "Bluetooth: MGMT/SMP: Fix address type when using SMP over BREDR/LE" Bluetooth: MGMT: Fix not generating command complete for MGMT_OP_DISCONNECT Bluetooth: hci_sync: Introduce hci_cmd_sync_run/hci_cmd_sync_run_once Bluetooth: qca: If memdump doesn't work, re-enable IBS ==================== Link: https://patch.msgid.link/20240830220300.1316772-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-02Merge tag 'linux-can-fixes-for-6.11-20240830' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2024-08-30 The first patch is by Kuniyuki Iwashima for the CAN BCM protocol that adds a missing proc entry removal when a device unregistered. Simon Horman fixes the cleanup in the error cleanup path of the m_can driver's open function. Markus Schneider-Pargmann contributes 7 fixes for the m_can driver, all related to the recently added IRQ coalescing support. The next 2 patches are by me, target the mcp251xfd driver and fix ring and coalescing configuration problems when switching from CAN-CC to CAN-FD mode. Simon Arlott's patch fixes a possible deadlock in the mcp251x driver. The last patch is by Martin Jocic for the kvaser_pciefd driver and fixes a problem with lost IRQs, which result in starvation, under high load situations. * tag 'linux-can-fixes-for-6.11-20240830' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: kvaser_pciefd: Use a single write when releasing RX buffers can: mcp251x: fix deadlock if an interrupt occurs during mcp251x_open can: mcp251xfd: mcp251xfd_ring_init(): check TX-coalescing configuration can: mcp251xfd: fix ring configuration when switching from CAN-CC to CAN-FD mode can: m_can: Limit coalescing to peripheral instances can: m_can: Reset cached active_interrupts on start can: m_can: disable_all_interrupts, not clear active_interrupts can: m_can: Do not cancel timer from within timer can: m_can: Remove m_can_rx_peripheral indirection can: m_can: Remove coalesing disable in isr during suspend can: m_can: Reset coalescing during suspend/resume can: m_can: Release irq on error in m_can_open can: bcm: Remove proc entry when dev is unregistered. ==================== Link: https://patch.msgid.link/20240830215914.1610393-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-02smb: client: fix hang in wait_for_response() for negprotoPaulo Alcantara
Call cifs_reconnect() to wake up processes waiting on negotiate protocol to handle the case where server abruptly shut down and had no chance to properly close the socket. Simple reproducer: ssh 192.168.2.100 pkill -STOP smbd mount.cifs //192.168.2.100/test /mnt -o ... [never returns] Cc: Rickard Andersson <rickaran@axis.com> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-09-02btrfs: zoned: handle broken write pointer on zonesNaohiro Aota
Btrfs rejects to mount a FS if it finds a block group with a broken write pointer (e.g, unequal write pointers on two zones of RAID1 block group). Since such case can happen easily with a power-loss or crash of a system, we need to handle the case more gently. Handle such block group by making it unallocatable, so that there will be no writes into it. That can be done by setting the allocation pointer at the end of allocating region (= block_group->zone_capacity). Then, existing code handle zone_unusable properly. Having proper zone_capacity is necessary for the change. So, set it as fast as possible. We cannot handle RAID0 and RAID10 case like this. But, they are anyway unable to read because of a missing stripe. Fixes: 265f7237dd25 ("btrfs: zoned: allow DUP on meta-data block groups") Fixes: 568220fa9657 ("btrfs: zoned: support RAID0/1/10 on top of raid stripe tree") CC: stable@vger.kernel.org # 6.1+ Reported-by: HAN Yuwei <hrx@bupt.moe> Cc: Xuefer <xuefer@gmail.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-02perf daemon: Fix the build on more 32-bit architecturesArnaldo Carvalho de Melo
FYI: I'm carrying this on perf-tools-next. The previous attempt fixed the build on debian:experimental-x-mipsel, but when building on a larger set of containers I noticed it broke the build on some other 32-bit architectures such as: 42 7.87 ubuntu:18.04-x-arm : FAIL gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) builtin-daemon.c: In function 'cmd_session_list': builtin-daemon.c:692:16: error: format '%llu' expects argument of type 'long long unsigned int', but argument 4 has type 'long int' [-Werror=format=] fprintf(out, "%c%" PRIu64, ^~~~~ builtin-daemon.c:694:13: csv_sep, (curr - daemon->start) / 60); ~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from builtin-daemon.c:3:0: /usr/arm-linux-gnueabihf/include/inttypes.h:105:34: note: format string is defined here # define PRIu64 __PRI64_PREFIX "u" So lets cast that time_t (32-bit/64-bit) to uint64_t to make sure it builds everywhere. Fixes: 4bbe6002931954bb ("perf daemon: Fix the build on 32-bit architectures") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/ZsPmldtJ0D9Cua9_@x1 Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-02perf python: include "util/sample.h"Xu Yang
The 32-bit arm build system will complain: tools/perf/util/python.c:75:28: error: field ‘sample’ has incomplete type 75 | struct perf_sample sample; However, arm64 build system doesn't complain this. The root cause is arm64 define "HAVE_KVM_STAT_SUPPORT := 1" in tools/perf/arch/arm64/Makefile, but arm arch doesn't define this. This will lead to kvm-stat.h include other header files on arm64 build system, especially "util/sample.h" for util/python.c. This will try to directly include "util/sample.h" for "util/python.c" to avoid such build issue on arm platform. Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Cc: imx@lists.linux.dev Link: https://lore.kernel.org/r/20240819023403.201324-1-xu.yang_2@nxp.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-02perf lock contention: Fix spinlock and rwlock accountingNamhyung Kim
The spinlock and rwlock use a single-element per-cpu array to track current locks due to performance reason. But this means the key is always available and it cannot simply account lock stats in the array because some of them are invalid. In fact, the contention_end() program in the BPF invalidates the entry by setting the 'lock' value to 0 instead of deleting the entry for the hashmap. So it should skip entries with the lock value of 0 in the account_end_timestamp(). Otherwise, it'd have spurious high contention on an idle machine: $ sudo perf lock con -ab -Y spinlock sleep 3 contended total wait max wait avg wait type caller 8 4.72 s 1.84 s 590.46 ms spinlock rcu_core+0xc7 8 1.87 s 1.87 s 233.48 ms spinlock process_one_work+0x1b5 2 1.87 s 1.87 s 933.92 ms spinlock worker_thread+0x1a2 3 1.81 s 1.81 s 603.93 ms spinlock tmigr_update_events+0x13c 2 1.72 s 1.72 s 861.98 ms spinlock tick_do_update_jiffies64+0x25 6 42.48 us 13.02 us 7.08 us spinlock futex_q_lock+0x2a 1 13.03 us 13.03 us 13.03 us spinlock futex_wake+0xce 1 11.61 us 11.61 us 11.61 us spinlock rcu_core+0xc7 I don't believe it has contention on a spinlock longer than 1 second. After this change, it only reports some small contentions. $ sudo perf lock con -ab -Y spinlock sleep 3 contended total wait max wait avg wait type caller 4 133.51 us 43.29 us 33.38 us spinlock tick_do_update_jiffies64+0x25 4 69.06 us 31.82 us 17.27 us spinlock process_one_work+0x1b5 2 50.66 us 25.77 us 25.33 us spinlock rcu_core+0xc7 1 28.45 us 28.45 us 28.45 us spinlock rcu_core+0xc7 1 24.77 us 24.77 us 24.77 us spinlock tmigr_update_events+0x13c 1 23.34 us 23.34 us 23.34 us spinlock raw_spin_rq_lock_nested+0x15 Fixes: b5711042a1c8 ("perf lock contention: Use per-cpu array map for spinlocks") Reported-by: Xi Wang <xii@google.com> Cc: Song Liu <song@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20240828052953.1445862-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-02perf test pmu: Set uninitialized PMU alias to nullVeronika Molnarova
Commit 3e0bf9fde2984469 ("perf pmu: Restore full PMU name wildcard support") adds a test case "PMU cmdline match" that covers PMU name wildcard support provided by function perf_pmu__match(). The test works with a wide range of supported combinations of PMU name matching but omits the case that if the perf_pmu__match() cannot match the PMU name to the wildcard, it tries to match its alias. However, this variable is not set up, causing the test case to fail when run with subprocesses or to segfault if run as a single process. ./perf test -vv 9 9: Sysfs PMU tests : 9.1: Parsing with PMU format directory : Ok 9.2: Parsing with PMU event : Ok 9.3: PMU event names : Ok 9.4: PMU name combining : Ok 9.5: PMU name comparison : Ok 9.6: PMU cmdline match : FAILED! ./perf test -F 9 9.1: Parsing with PMU format directory : Ok 9.2: Parsing with PMU event : Ok 9.3: PMU event names : Ok 9.4: PMU name combining : Ok 9.5: PMU name comparison : Ok Segmentation fault (core dumped) Initialize the PMU alias to null for all tests of perf_pmu__match() as this functionality is not being tested and the alias matching works exactly the same as the matching of the PMU name. ./perf test -F 9 9.1: Parsing with PMU format directory : Ok 9.2: Parsing with PMU event : Ok 9.3: PMU event names : Ok 9.4: PMU name combining : Ok 9.5: PMU name comparison : Ok 9.6: PMU cmdline match : Ok Fixes: 3e0bf9fde2984469 ("perf pmu: Restore full PMU name wildcard support") Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com> Cc: james.clark@arm.com Cc: mpetlan@redhat.com Cc: rstoyano@redhat.com Link: https://lore.kernel.org/r/20240808103749.9356-1-vmolnaro@redhat.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-02btrfs: qgroup: don't use extent changeset when not neededFedor Pchelkin
The local extent changeset is passed to clear_record_extent_bits() where it may have some additional memory dynamically allocated for ulist. When qgroup is disabled, the memory is leaked because in this case the changeset is not released upon __btrfs_qgroup_release_data() return. Since the recorded contents of the changeset are not used thereafter, just don't pass it. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Reported-by: syzbot+81670362c283f3dd889c@syzkaller.appspotmail.com Closes: https://lore.kernel.org/lkml/000000000000aa8c0c060ade165e@google.com Fixes: af0e2aab3b70 ("btrfs: qgroup: flush reservations during quota disable") CC: stable@vger.kernel.org # 6.10+ Reviewed-by: Boris Burkov <boris@bur.io> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Signed-off-by: David Sterba <dsterba@suse.com>
2024-09-02drm/amd/display: Block timing sync for different signals in PMODillon Varone
PMO assumes that like timings can be synchronized, but DC only allows this if the signal types match. Reviewed-by: Austin Zheng <austin.zheng@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 29d3d6af43135de7bec677f334292ca8dab53d67) Cc: stable@vger.kernel.org
2024-09-02drm/amd/display: Lock DC and exit IPS when changing backlightLeo Li
Backlight updates require aux and/or register access. Therefore, driver needs to disallow IPS beforehand. So, acquire the dc lock before calling into dc to update backlight - we should be doing this regardless of IPS. Then, while the lock is held, disallow IPS before calling into dc, then allow IPS afterwards (if it was previously allowed). Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 988fe2862635c1b1b40e41c85c24db44ab337c13) Cc: stable@vger.kernel.org # 6.10+
2024-09-02drm/amdgpu: always allocate cleared VRAM for GEM allocationsAlex Deucher
This adds allocation latency, but aligns better with user expectations. The latency should improve with the drm buddy clearing patches that Arun has been working on. In addition this fixes the high CPU spikes seen when doing wipe on release. v2: always set AMDGPU_GEM_CREATE_VRAM_CLEARED (Christian) Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3528 Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality") Acked-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Cc: Christian König <christian.koenig@amd.com> (cherry picked from commit 6c0a7c3c693ac84f8b50269a9088af8f37446863) Cc: stable@vger.kernel.org # 6.10.x
2024-09-02drm/amdgpu/mes: add mes mapping legacy queue switchJack Xiao
For mes11 old firmware has issue to map legacy queue, add a flag to switch mes to map legacy queue. Fixes: f9d8c5c7855d ("drm/amdgpu/gfx: enable mes to map legacy queue support") Reported-by: Andrew Worsley <amworsley@gmail.com> Link: https://lists.freedesktop.org/archives/amd-gfx/2024-August/112773.html Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 52491d97aadcde543986d596ed55f70bf2142851)
2024-09-02drm/amd/display: Determine IPS mode by ASIC and PMFW versionsLeo Li
[Why] DCN IPS interoperates with other system idle power features, such as Zstates. On DCN35, there is a known issue where system Z8 + DCN IPS2 causes a hard hang. We observe this on systems where the SBIOS allows Z8. Though there is a SBIOS fix, there's no guarantee that users will get it any time soon, or even install it. A workaround is needed to prevent this from rearing its head in the wild. [How] For DCN35, check the pmfw version to determine whether the SBIOS has the fix. If not, set IPS1+RCG as the deepest possible state in all cases except for s0ix and display off (DPMS). Otherwise, enable all IPS Signed-off-by: Leo Li <sunpeng.li@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 28d43d0895896f84c038d906d244e0a95eb243ec) Cc: stable@vger.kernel.org
2024-09-02Revert "wifi: ath11k: support hibernation"Baochen Qiang
This reverts commit 166a490f59ac10340ee5330e51c15188ce2a7f8f. There are several reports that this commit breaks system suspend on some specific Lenovo platforms. Since there is no fix available, for now revert this commit to make suspend work again on those platforms. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219196 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2301921 Cc: <stable@vger.kernel.org> # 6.10.x: d3e154d7776b: Revert "wifi: ath11k: restore country code during resume" Cc: <stable@vger.kernel.org> # 6.10.x Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://patch.msgid.link/20240830073420.5790-3-quic_bqiang@quicinc.com
2024-09-02Revert "wifi: ath11k: restore country code during resume"Baochen Qiang
This reverts commit 7f0343b7b8710436c1e6355c71782d32ada47e0c. We are going to revert commit 166a490f59ac ("wifi: ath11k: support hibernation"), on which this commit depends. With that commit reverted, this one is not needed any more, so revert this commit first. Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://patch.msgid.link/20240830073420.5790-2-quic_bqiang@quicinc.com
2024-09-02KVM: x86: Only advertise KVM_CAP_READONLY_MEM when supported by VMTom Dohrmann
Until recently, KVM_CAP_READONLY_MEM was unconditionally supported on x86, but this is no longer the case for SEV-ES and SEV-SNP VMs. When KVM_CHECK_EXTENSION is invoked on a VM, only advertise KVM_CAP_READONLY_MEM when it's actually supported. Fixes: 66155de93bcf ("KVM: x86: Disallow read-only memslots for SEV-ES and SEV-SNP (and TDX)") Cc: Sean Christopherson <seanjc@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Michael Roth <michael.roth@amd.com> Signed-off-by: Tom Dohrmann <erbse.13@gmx.de> Message-ID: <20240902144219.3716974-1-erbse.13@gmx.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-09-02Merge tag 'kvm-x86-fixes-6.11-rcN' of https://github.com/kvm-x86/linux into ↵Paolo Bonzini
kvm-master KVM x86 fixes for 6.11 - Fixup missed comments from the REMOVED_SPTE=>FROZEN_SPTE rename. - Ensure a root is successfully loaded when pre-faulting SPTEs. - Grab kvm->srcu when handling KVM_SET_VCPU_EVENTS to guard against accessing memslots if toggling SMM happens to force a VM-Exit. - Emulate MSR_{FS,GS}_BASE on SVM even though interception is always disabled, so that KVM does the right thing if KVM's emulator encounters {RD,WR}MSR. - Explicitly clear BUS_LOCK_DETECT from KVM's caps on AMD, as KVM doesn't yet virtualize BUS_LOCK_DETECT on AMD. - Cleanup the help message for CONFIG_KVM_AMD_SEV, and call out that KVM now supports SEV-SNP too.
2024-09-02hwmon: (hp-wmi-sensors) Check if WMI event data existsArmin Wolf
The BIOS can choose to return no event data in response to a WMI event, so the ACPI object passed to the WMI notify handler can be NULL. Check for such a situation and ignore the event in such a case. Fixes: 23902f98f8d4 ("hwmon: add HP WMI Sensors driver") Signed-off-by: Armin Wolf <W_Armin@gmx.de> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Message-ID: <20240901031055.3030-2-W_Armin@gmx.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-09-02gpio: modepin: Enable module autoloadingLiao Chen
Add MODULE_DEVICE_TABLE(), so modules could be properly autoloaded based on the alias from of_device_id table. Fixes: 7687a5b0ee93 ("gpio: modepin: Add driver support for modepin GPIO controller") Signed-off-by: Liao Chen <liaochen4@huawei.com> Reviewed-by: Michal Simek <michal.simek@amd.com> Link: https://lore.kernel.org/r/20240902115848.904227-1-liaochen4@huawei.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-09-02igc: Unlock on error in igc_io_resume()Dan Carpenter
Call rtnl_unlock() on this error path, before returning. Fixes: bc23aa949aeb ("igc: Add pcie error handler support") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-02gpio: rockchip: fix OF node leak in probe()Krzysztof Kozlowski
Driver code is leaking OF node reference from of_get_parent() in probe(). Fixes: 936ee2675eee ("gpio/rockchip: add driver for rockchip gpio") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com> Link: https://lore.kernel.org/r/20240826150832.65657-1-krzysztof.kozlowski@linaro.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-09-02spi: bcm63xx: Enable module autoloadingLiao Chen
Add MODULE_DEVICE_TABLE(), so modules could be properly autoloaded based on the alias from of_device_id table. Signed-off-by: Liao Chen <liaochen4@huawei.com> Link: https://patch.msgid.link/20240831094231.795024-1-liaochen4@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-09-02drm/i915/fence: Mark debug_fence_free() with __maybe_unusedAndy Shevchenko
When debug_fence_free() is unused (CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS=n), it prevents kernel builds with clang, `make W=1` and CONFIG_WERROR=y: .../i915_sw_fence.c:118:20: error: unused function 'debug_fence_free' [-Werror,-Wunused-function] 118 | static inline void debug_fence_free(struct i915_sw_fence *fence) | ^~~~~~~~~~~~~~~~ Fix this by marking debug_fence_free() with __maybe_unused. See also commit 6863f5643dd7 ("kbuild: allow Clang to find unused static inline functions for W=1 build"). Fixes: fc1584059d6c ("drm/i915: Integrate i915_sw_fence with debugobjects") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240829155950.1141978-3-andriy.shevchenko@linux.intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 8be4dce5ea6f2368cc25edc71989c4690fa66964) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-09-02drm/i915/fence: Mark debug_fence_init_onstack() with __maybe_unusedAndy Shevchenko
When debug_fence_init_onstack() is unused (CONFIG_DRM_I915_SELFTEST=n), it prevents kernel builds with clang, `make W=1` and CONFIG_WERROR=y: .../i915_sw_fence.c:97:20: error: unused function 'debug_fence_init_onstack' [-Werror,-Wunused-function] 97 | static inline void debug_fence_init_onstack(struct i915_sw_fence *fence) | ^~~~~~~~~~~~~~~~~~~~~~~~ Fix this by marking debug_fence_init_onstack() with __maybe_unused. See also commit 6863f5643dd7 ("kbuild: allow Clang to find unused static inline functions for W=1 build"). Fixes: 214707fc2ce0 ("drm/i915/selftests: Wrap a timer into a i915_sw_fence") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240829155950.1141978-2-andriy.shevchenko@linux.intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 5bf472058ffb43baf6a4cdfe1d7f58c4c194c688) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-09-02drm/i915: Fix readout degamma_lut mismatch on ilk/snbVille Syrjälä
On ilk/snb the pipe may be configured to place the LUT before or after the CSC depending on various factors, but as there is only one LUT (no split mode like on IVB+) we only advertise a gamma_lut and no degamma_lut in the uapi to avoid confusing userspace. This can cause a problem during readout if the VBIOS/GOP enabled the LUT in the pre CSC configuration. The current code blindly assigns the results of the readout to the degamma_lut, which will cause a failure during the next atomic_check() as we aren't expecting anything to be in degamma_lut since it's not visible to userspace. Fix the problem by assigning whatever LUT we read out from the hardware into gamma_lut. Cc: stable@vger.kernel.org Fixes: d2559299d339 ("drm/i915: Make ilk_read_luts() capable of degamma readout") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11608 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240710124137.16773-1-ville.syrjala@linux.intel.com Reviewed-by: Uma Shankar <uma.shankar@intel.com> (cherry picked from commit 33eca84db6e31091cef63584158ab64704f78462) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-09-02drm/i915: Do not attempt to load the GSC multiple timesDaniele Ceraolo Spurio
If the GSC FW fails to load the GSC HW hangs permanently; the only ways to recover it are FLR or D3cold entry, with the former only being supported on driver unload and the latter only on DGFX, for which we don't need to load the GSC. Therefore, if GSC fails to load there is no need to try again because the HW is stuck in the error state and the submission to load the FW would just hang the GSCCS. Note that, due to wa_14015076503, on MTL the GuC escalates all GSCCS hangs to full GT resets, which would trigger a new attempt to load the GSC FW in the post-reset HW re-init; this issue is also fixed by not attempting to load the GSC FW after an error. Fixes: 15bd4a67e914 ("drm/i915/gsc: GSC firmware loading") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.3+ Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240820215952.2290807-1-daniele.ceraolospurio@intel.com (cherry picked from commit 03ded4d432a1fb7bb6c44c5856d14115f6f6c3b9) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-09-02Merge tag 'timers-v6.11-rc7' of ↵Thomas Gleixner
https://git.linaro.org/people/daniel.lezcano/linux into timers/urgent Pull clocksource driver fixes from Daniel Lezcano: - Remove percpu irq related code in the timer-of initialization routine as it is broken but also unused (Daniel Lezcano) - Fix return -ETIME when delta exceeds INT_MAX and the next event not taking effect sometimes (Jacky Bai) Link: https://lore.kernel.org/all/d0e93dbd-b796-4726-b38c-089b685591c9@linaro.org
2024-09-02net: microchip: vcap: Fix use-after-free error in kunit testJens Emil Schulz Østergaard
This is a clear use-after-free error. We remove it, and rely on checking the return code of vcap_del_rule. Reported-by: Dan Carpenter <error27@gmail.com> Closes: https://lore.kernel.org/kernel-janitors/7bffefc6-219a-4f71-baa0-ad4526e5c198@kili.mountain/ Fixes: c956b9b318d9 ("net: microchip: sparx5: Adding KUNIT tests of key/action values in VCAP API") Signed-off-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-02MAINTAINERS: Remove Wedson as Rust maintainerWedson Almeida Filho
I am retiring from the project, so removing myself from MAINTAINERS as I won't have time to dedicate to it. Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com> Link: https://lore.kernel.org/r/20240828211117.9422-2-wedsonaf@gmail.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-02pinctrl: qcom: x1e80100: Bypass PDC wakeup parent for nowStephan Gerhold
On X1E80100, GPIO interrupts for wakeup-capable pins have been broken since the introduction of the pinctrl driver. This prevents keyboard and touchpad from working on most of the X1E laptops. So far we have worked around this by manually building a kernel with the "wakeup-parent" removed from the pinctrl node in the device tree, but we cannot expect all users to do that. Implement a similar workaround in the driver by clearing the wakeirq_map for X1E80100. This avoids using the PDC wakeup parent for all GPIOs and handles the interrupts directly in the pinctrl driver instead. The PDC driver needs additional changes to support X1E80100 properly. Adding a workaround separately first allows to land the necessary PDC changes through the normal release cycle, while still solving the more critical problem with keyboard and touchpad on the current stable kernel versions. Bypassing the PDC is enough for now, because we have not yet enabled the deep idle states where using the PDC becomes necessary. Cc: stable@vger.kernel.org Fixes: 05e4941d97ef ("pinctrl: qcom: Add X1E80100 pinctrl driver") Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Tested-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Konrad Dybcio <konradybcio@kernel.org> Reviewed-by: Abel Vesa <abel.vesa@linaro.org> Link: https://lore.kernel.org/20240830-x1e80100-bypass-pdc-v1-1-d4c00be0c3e3@linaro.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2024-09-02drm/imagination: Free pvr_vm_gpuva after unlinkMatt Coster
This caused a measurable memory leak. Although the individual allocations are small, the leaks occurs in a high-usage codepath (remapping or unmapping device memory) so they add up quickly. Fixes: ff5f643de0bf ("drm/imagination: Add GEM and VM related code") Cc: stable@vger.kernel.org Reviewed-by: Frank Binns <frank.binns@imgtec.com> Link: https://patchwork.freedesktop.org/patch/msgid/35867394-d8ce-4698-a8fd-919a018f1583@imgtec.com Signed-off-by: Matt Coster <matt.coster@imgtec.com>
2024-09-02clocksource/drivers/imx-tpm: Fix next event not taking effect sometimeJacky Bai
The value written into the TPM CnV can only be updated into the hardware when the counter increases. Additional writes to the CnV write buffer are ignored until the register has been updated. Therefore, we need to check if the CnV has been updated before continuing. This may require waiting for 1 counter cycle in the worst case. Cc: stable@vger.kernel.org Fixes: 059ab7b82eec ("clocksource/drivers/imx-tpm: Add imx tpm timer support") Signed-off-by: Jacky Bai <ping.bai@nxp.com> Reviewed-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Ye Li <ye.li@nxp.com> Reviewed-by: Jason Liu <jason.hui.liu@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20240725193355.1436005-2-Frank.Li@nxp.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-09-02clocksource/drivers/imx-tpm: Fix return -ETIME when delta exceeds INT_MAXJacky Bai
In tpm_set_next_event(delta), return -ETIME by wrong cast to int when delta is larger than INT_MAX. For example: tpm_set_next_event(delta = 0xffff_fffe) { ... next = tpm_read_counter(); // assume next is 0x10 next += delta; // next will 0xffff_fffe + 0x10 = 0x1_0000_000e now = tpm_read_counter(); // now is 0x10 ... return (int)(next - now) <= 0 ? -ETIME : 0; ^^^^^^^^^^ 0x1_0000_000e - 0x10 = 0xffff_fffe, which is -2 when cast to int. So return -ETIME. } To fix this, introduce a 'prev' variable and check if 'now - prev' is larger than delta. Cc: stable@vger.kernel.org Fixes: 059ab7b82eec ("clocksource/drivers/imx-tpm: Add imx tpm timer support") Signed-off-by: Jacky Bai <ping.bai@nxp.com> Reviewed-by: Peng Fan <peng.fan@nxp.com> Reviewed-by: Ye Li <ye.li@nxp.com> Reviewed-by: Jason Liu <jason.hui.liu@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20240725193355.1436005-1-Frank.Li@nxp.com Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-09-02clocksource/drivers/timer-of: Remove percpu irq related codeDaniel Lezcano
GCC's named address space checks errors out with: drivers/clocksource/timer-of.c: In function ‘timer_of_irq_exit’: drivers/clocksource/timer-of.c:29:46: error: passing argument 2 of ‘free_percpu_irq’ from pointer to non-enclosed address space 29 | free_percpu_irq(of_irq->irq, clkevt); | ^~~~~~ In file included from drivers/clocksource/timer-of.c:8: ./include/linux/interrupt.h:201:43: note: expected ‘__seg_gs void *’ but argument is of type ‘struct clock_event_device *’ 201 | extern void free_percpu_irq(unsigned int, void __percpu *); | ^~~~~~~~~~~~~~~ drivers/clocksource/timer-of.c: In function ‘timer_of_irq_init’: drivers/clocksource/timer-of.c:74:51: error: passing argument 4 of ‘request_percpu_irq’ from pointer to non-enclosed address space 74 | np->full_name, clkevt) : | ^~~~~~ ./include/linux/interrupt.h:190:56: note: expected ‘__seg_gs void *’ but argument is of type ‘struct clock_event_device *’ 190 | const char *devname, void __percpu *percpu_dev_id) Sparse warns about: timer-of.c:29:46: warning: incorrect type in argument 2 (different address spaces) timer-of.c:29:46: expected void [noderef] __percpu * timer-of.c:29:46: got struct clock_event_device *clkevt timer-of.c:74:51: warning: incorrect type in argument 4 (different address spaces) timer-of.c:74:51: expected void [noderef] __percpu *percpu_dev_id timer-of.c:74:51: got struct clock_event_device *clkevt It appears the code is incorrect as reported by Uros Bizjak: "The referred code is questionable as it tries to reuse the clkevent pointer once as percpu pointer and once as generic pointer, which should be avoided." This change removes the percpu related code as no drivers is using it. [Daniel: Fixed the description] Fixes: dc11bae785295 ("clocksource/drivers: Add timer-of common init routine") Reported-by: Uros Bizjak <ubizjak@gmail.com> Tested-by: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20240819100335.2394751-1-daniel.lezcano@linaro.org Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-09-02rust: macros: provide correct provenance when constructing THIS_MODULEBoqun Feng
Currently while defining `THIS_MODULE` symbol in `module!()`, the pointer used to construct `ThisModule` is derived from an immutable reference of `__this_module`, which means the pointer doesn't have the provenance for writing, and that means any write to that pointer is UB regardless of data races or not. However, the usage of `THIS_MODULE` includes passing this pointer to functions that may write to it (probably in unsafe code), and this will create soundness issues. One way to fix this is using `addr_of_mut!()` but that requires the unstable feature "const_mut_refs". So instead of `addr_of_mut()!`, an extern static `Opaque` is used here: since `Opaque<T>` is transparent to `T`, an extern static `Opaque` will just wrap the C symbol (defined in a C compile unit) in an `Opaque`, which provides a pointer with writable provenance via `Opaque::get()`. This fix the potential UBs because of pointer provenance unmatched. Reported-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Trevor Gross <tmgross@umich.edu> Reviewed-by: Benno Lossin <benno.lossin@proton.me> Reviewed-by: Gary Guo <gary@garyguo.net> Closes: https://rust-for-linux.zulipchat.com/#narrow/stream/x/topic/x/near/465412664 Fixes: 1fbde52bde73 ("rust: add `macros` crate") Cc: stable@vger.kernel.org # 6.6.x: be2ca1e03965: ("rust: types: Make Opaque::get const") Link: https://lore.kernel.org/r/20240828180129.4046355-1-boqun.feng@gmail.com [ Fixed two typos, reworded title. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-01Merge tag 'ata-6.11-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fix from Damien Le Moal: - Fix a potential memory leak in the ata host initialization code (from Zheng) * tag 'ata-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: libata: Fix memory leak for error path in ata_host_alloc()
2024-09-01alloc_tag: fix allocation tag reporting when CONFIG_MODULES=nSuren Baghdasaryan
codetag_module_init() is used to initialize sections containing allocation tags. This function is used to initialize module sections as well as core kernel sections, in which case the module parameter is set to NULL. This function has to be called even when CONFIG_MODULES=n to initialize core kernel allocation tag sections. When CONFIG_MODULES=n, this function is a NOP, which is wrong. This leads to /proc/allocinfo reported as empty. Fix this by making it independent of CONFIG_MODULES. Link: https://lkml.kernel.org/r/20240828231536.1770519-1-surenb@google.com Fixes: 916cc5167cc6 ("lib: code tagging framework") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Sourav Panda <souravpanda@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> [6.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01mm: vmalloc: optimize vmap_lazy_nr arithmetic when purging each vmap_areaAdrian Huang
When running the vmalloc stress on a 448-core system, observe the average latency of purge_vmap_node() is about 2 seconds by using the eBPF/bcc 'funclatency.py' tool [1]. # /your-git-repo/bcc/tools/funclatency.py -u purge_vmap_node & pid1=$! && sleep 8 && modprobe test_vmalloc nr_threads=$(nproc) run_test_mask=0x7; kill -SIGINT $pid1 usecs : count distribution 0 -> 1 : 0 | | 2 -> 3 : 29 | | 4 -> 7 : 19 | | 8 -> 15 : 56 | | 16 -> 31 : 483 |**** | 32 -> 63 : 1548 |************ | 64 -> 127 : 2634 |********************* | 128 -> 255 : 2535 |********************* | 256 -> 511 : 1776 |************** | 512 -> 1023 : 1015 |******** | 1024 -> 2047 : 573 |**** | 2048 -> 4095 : 488 |**** | 4096 -> 8191 : 1091 |********* | 8192 -> 16383 : 3078 |************************* | 16384 -> 32767 : 4821 |****************************************| 32768 -> 65535 : 3318 |*************************** | 65536 -> 131071 : 1718 |************** | 131072 -> 262143 : 2220 |****************** | 262144 -> 524287 : 1147 |********* | 524288 -> 1048575 : 1179 |********* | 1048576 -> 2097151 : 822 |****** | 2097152 -> 4194303 : 906 |******* | 4194304 -> 8388607 : 2148 |***************** | 8388608 -> 16777215 : 4497 |************************************* | 16777216 -> 33554431 : 289 |** | avg = 2041714 usecs, total: 78381401772 usecs, count: 38390 The worst case is over 16-33 seconds, so soft lockup is triggered [2]. [Root Cause] 1) Each purge_list has the long list. The following shows the number of vmap_area is purged. crash> p vmap_nodes vmap_nodes = $27 = (struct vmap_node *) 0xff2de5a900100000 crash> vmap_node 0xff2de5a900100000 128 | grep nr_purged nr_purged = 663070 ... nr_purged = 821670 nr_purged = 692214 nr_purged = 726808 ... 2) atomic_long_sub() employs the 'lock' prefix to ensure the atomic operation when purging each vmap_area. However, the iteration is over 600000 vmap_area (See 'nr_purged' above). Here is objdump output: $ objdump -D vmlinux ffffffff813e8c80 <purge_vmap_node>: ... ffffffff813e8d70: f0 48 29 2d 68 0c bb lock sub %rbp,0x2bb0c68(%rip) ... Quote from "Instruction tables" pdf file [3]: Instructions with a LOCK prefix have a long latency that depends on cache organization and possibly RAM speed. If there are multiple processors or cores or direct memory access (DMA) devices, then all locked instructions will lock a cache line for exclusive access, which may involve RAM access. A LOCK prefix typically costs more than a hundred clock cycles, even on single-processor systems. That's why the latency of purge_vmap_node() dramatically increases on a many-core system: One core is busy on purging each vmap_area of the *long* purge_list and executing atomic_long_sub() for each vmap_area, while other cores free vmalloc allocations and execute atomic_long_add_return() in free_vmap_area_noflush(). [Solution] Employ a local variable to record the total purged pages, and execute atomic_long_sub() after the traversal of the purge_list is done. The experiment result shows the latency improvement is 99%. [Experiment Result] 1) System Configuration: Three servers (with HT-enabled) are tested. * 72-core server: 3rd Gen Intel Xeon Scalable Processor*1 * 192-core server: 5th Gen Intel Xeon Scalable Processor*2 * 448-core server: AMD Zen 4 Processor*2 2) Kernel Config * CONFIG_KASAN is disabled 3) The data in column "w/o patch" and "w/ patch" * Unit: micro seconds (us) * Each data is the average of 3-time measurements System w/o patch (us) w/ patch (us) Improvement (%) --------------- -------------- ------------- ------------- 72-core server 2194 14 99.36% 192-core server 143799 1139 99.21% 448-core server 1992122 6883 99.65% [1] https://github.com/iovisor/bcc/blob/master/tools/funclatency.py [2] https://gist.github.com/AdrianHuang/37c15f67b45407b83c2d32f918656c12 [3] https://www.agner.org/optimize/instruction_tables.pdf Link: https://lkml.kernel.org/r/20240829130633.2184-1-ahuang12@lenovo.com Signed-off-by: Adrian Huang <ahuang12@lenovo.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01mailmap: update entry for Jan KuligaJan Kuliga
Soon I won't be able to use my current email address. Link: https://lkml.kernel.org/r/20240830095658.1203198-1-jankul@alatek.krakow.pl Signed-off-by: Jan Kuliga <jankul@alatek.krakow.pl> Cc: David S. Miller <davem@davemloft.net> Cc: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01codetag: debug: mark codetags for poisoned page as emptyHao Ge
When PG_hwpoison pages are freed they are treated differently in free_pages_prepare() and instead of being released they are isolated. Page allocation tag counters are decremented at this point since the page is considered not in use. Later on when such pages are released by unpoison_memory(), the allocation tag counters will be decremented again and the following warning gets reported: [ 113.930443][ T3282] ------------[ cut here ]------------ [ 113.931105][ T3282] alloc_tag was not set [ 113.931576][ T3282] WARNING: CPU: 2 PID: 3282 at ./include/linux/alloc_tag.h:130 pgalloc_tag_sub.part.66+0x154/0x164 [ 113.932866][ T3282] Modules linked in: hwpoison_inject fuse ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute ip6table_nat ip6table_man4 [ 113.941638][ T3282] CPU: 2 UID: 0 PID: 3282 Comm: madvise11 Kdump: loaded Tainted: G W 6.11.0-rc4-dirty #18 [ 113.943003][ T3282] Tainted: [W]=WARN [ 113.943453][ T3282] Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022 [ 113.944378][ T3282] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 113.945319][ T3282] pc : pgalloc_tag_sub.part.66+0x154/0x164 [ 113.946016][ T3282] lr : pgalloc_tag_sub.part.66+0x154/0x164 [ 113.946706][ T3282] sp : ffff800087093a10 [ 113.947197][ T3282] x29: ffff800087093a10 x28: ffff0000d7a9d400 x27: ffff80008249f0a0 [ 113.948165][ T3282] x26: 0000000000000000 x25: ffff80008249f2b0 x24: 0000000000000000 [ 113.949134][ T3282] x23: 0000000000000001 x22: 0000000000000001 x21: 0000000000000000 [ 113.950597][ T3282] x20: ffff0000c08fcad8 x19: ffff80008251e000 x18: ffffffffffffffff [ 113.952207][ T3282] x17: 0000000000000000 x16: 0000000000000000 x15: ffff800081746210 [ 113.953161][ T3282] x14: 0000000000000000 x13: 205d323832335420 x12: 5b5d353031313339 [ 113.954120][ T3282] x11: ffff800087093500 x10: 000000000000005d x9 : 00000000ffffffd0 [ 113.955078][ T3282] x8 : 7f7f7f7f7f7f7f7f x7 : ffff80008236ba90 x6 : c0000000ffff7fff [ 113.956036][ T3282] x5 : ffff000b34bf4dc8 x4 : ffff8000820aba90 x3 : 0000000000000001 [ 113.956994][ T3282] x2 : ffff800ab320f000 x1 : 841d1e35ac932e00 x0 : 0000000000000000 [ 113.957962][ T3282] Call trace: [ 113.958350][ T3282] pgalloc_tag_sub.part.66+0x154/0x164 [ 113.959000][ T3282] pgalloc_tag_sub+0x14/0x1c [ 113.959539][ T3282] free_unref_page+0xf4/0x4b8 [ 113.960096][ T3282] __folio_put+0xd4/0x120 [ 113.960614][ T3282] folio_put+0x24/0x50 [ 113.961103][ T3282] unpoison_memory+0x4f0/0x5b0 [ 113.961678][ T3282] hwpoison_unpoison+0x30/0x48 [hwpoison_inject] [ 113.962436][ T3282] simple_attr_write_xsigned.isra.34+0xec/0x1cc [ 113.963183][ T3282] simple_attr_write+0x38/0x48 [ 113.963750][ T3282] debugfs_attr_write+0x54/0x80 [ 113.964330][ T3282] full_proxy_write+0x68/0x98 [ 113.964880][ T3282] vfs_write+0xdc/0x4d0 [ 113.965372][ T3282] ksys_write+0x78/0x100 [ 113.965875][ T3282] __arm64_sys_write+0x24/0x30 [ 113.966440][ T3282] invoke_syscall+0x7c/0x104 [ 113.966984][ T3282] el0_svc_common.constprop.1+0x88/0x104 [ 113.967652][ T3282] do_el0_svc+0x2c/0x38 [ 113.968893][ T3282] el0_svc+0x3c/0x1b8 [ 113.969379][ T3282] el0t_64_sync_handler+0x98/0xbc [ 113.969980][ T3282] el0t_64_sync+0x19c/0x1a0 [ 113.970511][ T3282] ---[ end trace 0000000000000000 ]--- To fix this, clear the page tag reference after the page got isolated and accounted for. Link: https://lkml.kernel.org/r/20240825163649.33294-1-hao.ge@linux.dev Fixes: d224eb0287fb ("codetag: debug: mark codetags for reserved pages as empty") Signed-off-by: Hao Ge <gehao@kylinos.cn> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: Suren Baghdasaryan <surenb@google.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hao Ge <gehao@kylinos.cn> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: <stable@vger.kernel.org> [6.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01mm/memcontrol: respect zswap.writeback setting from parent cg tooMike Yuan
Currently, the behavior of zswap.writeback wrt. the cgroup hierarchy seems a bit odd. Unlike zswap.max, it doesn't honor the value from parent cgroups. This surfaced when people tried to globally disable zswap writeback, i.e. reserve physical swap space only for hibernation [1] - disabling zswap.writeback only for the root cgroup results in subcgroups with zswap.writeback=1 still performing writeback. The inconsistency became more noticeable after I introduced the MemoryZSwapWriteback= systemd unit setting [2] for controlling the knob. The patch assumed that the kernel would enforce the value of parent cgroups. It could probably be workarounded from systemd's side, by going up the slice unit tree and inheriting the value. Yet I think it's more sensible to make it behave consistently with zswap.max and friends. [1] https://wiki.archlinux.org/title/Power_management/Suspend_and_hibernate#Disable_zswap_writeback_to_use_the_swap_space_only_for_hibernation [2] https://github.com/systemd/systemd/pull/31734 Link: https://lkml.kernel.org/r/20240823162506.12117-1-me@yhndnzj.com Fixes: 501a06fe8e4c ("zswap: memcontrol: implement zswap writeback disabling") Signed-off-by: Mike Yuan <me@yhndnzj.com> Reviewed-by: Nhat Pham <nphamcs@gmail.com> Acked-by: Yosry Ahmed <yosryahmed@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Koutný <mkoutny@suse.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01scripts: fix gfp-translate after ___GFP_*_BITS conversion to an enumMarc Zyngier
Richard reports that since 772dd0342727c ("mm: enumerate all gfp flags"), gfp-translate is broken, as the bit numbers are implicit, leaving the shell script unable to extract them. Even more, some bits are now at a variable location, making it double extra hard to parse using a simple shell script. Use a brute-force approach to the problem by generating a small C stub that will use the enum to dump the interesting bits. As an added bonus, we are now able to identify invalid bits for a given configuration. As an added drawback, we cannot parse include files that predate this change anymore. Tough luck. Link: https://lkml.kernel.org/r/20240823163850.3791201-1-maz@kernel.org Fixes: 772dd0342727 ("mm: enumerate all gfp flags") Signed-off-by: Marc Zyngier <maz@kernel.org> Reported-by: Richard Weinberger <richard@nod.at> Cc: Petr Tesařík <petr@tesarici.cz> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01Revert "mm: skip CMA pages when they are not available"Usama Arif
This reverts commit 5da226dbfce3 ("mm: skip CMA pages when they are not available") and b7108d66318a ("Multi-gen LRU: skip CMA pages when they are not eligible"). lruvec->lru_lock is highly contended and is held when calling isolate_lru_folios. If the lru has a large number of CMA folios consecutively, while the allocation type requested is not MIGRATE_MOVABLE, isolate_lru_folios can hold the lock for a very long time while it skips those. For FIO workload, ~150million order=0 folios were skipped to isolate a few ZONE_DMA folios [1]. This can cause lockups [1] and high memory pressure for extended periods of time [2]. Remove skipping CMA for MGLRU as well, as it was introduced in sort_folio for the same resaon as 5da226dbfce3a2f44978c2c7cf88166e69a6788b. [1] https://lore.kernel.org/all/CAOUHufbkhMZYz20aM_3rHZ3OcK4m2puji2FGpUpn_-DevGk3Kg@mail.gmail.com/ [2] https://lore.kernel.org/all/ZrssOrcJIDy8hacI@gmail.com/ [usamaarif642@gmail.com: also revert b7108d66318a, per Johannes] Link: https://lkml.kernel.org/r/9060a32d-b2d7-48c0-8626-1db535653c54@gmail.com Link: https://lkml.kernel.org/r/357ac325-4c61-497a-92a3-bdbd230d5ec9@gmail.com Link: https://lkml.kernel.org/r/9060a32d-b2d7-48c0-8626-1db535653c54@gmail.com Fixes: 5da226dbfce3 ("mm: skip CMA pages when they are not available") Signed-off-by: Usama Arif <usamaarif642@gmail.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Bharata B Rao <bharata@amd.com> Cc: Breno Leitao <leitao@debian.org> Cc: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yu Zhao <yuzhao@google.com> Cc: Zhaoyang Huang <huangzhaoyang@gmail.com> Cc: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01maple_tree: remove rcu_read_lock() from mt_validate()Liam R. Howlett
The write lock should be held when validating the tree to avoid updates racing with checks. Holding the rcu read lock during a large tree validation may also cause a prolonged rcu read window and "rcu_preempt detected stalls" warnings. Link: https://lore.kernel.org/all/0000000000001d12d4062005aea1@google.com/ Link: https://lkml.kernel.org/r/20240820175417.2782532-1-Liam.Howlett@oracle.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Reported-by: syzbot+036af2f0c7338a33b0cd@syzkaller.appspotmail.com Cc: Hillf Danton <hdanton@sina.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01kexec_file: fix elfcorehdr digest exclusion when CONFIG_CRASH_HOTPLUG=yPetr Tesarik
Fix the condition to exclude the elfcorehdr segment from the SHA digest calculation. The j iterator is an index into the output sha_regions[] array, not into the input image->segment[] array. Once it reaches image->elfcorehdr_index, all subsequent segments are excluded. Besides, if the purgatory segment precedes the elfcorehdr segment, the elfcorehdr may be wrongly included in the calculation. Link: https://lkml.kernel.org/r/20240805150750.170739-1-petr.tesarik@suse.com Fixes: f7cc804a9fd4 ("kexec: exclude elfcorehdr from the segment digest") Signed-off-by: Petr Tesarik <ptesarik@suse.com> Acked-by: Baoquan He <bhe@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Sourabh Jain <sourabhjain@linux.ibm.com> Cc: Eric DeVolder <eric_devolder@yahoo.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01mm/slub: add check for s->flags in the alloc_tagging_slab_free_hookHao Ge
When enable CONFIG_MEMCG & CONFIG_KFENCE & CONFIG_KMEMLEAK, the following warning always occurs,This is because the following call stack occurred: mem_pool_alloc kmem_cache_alloc_noprof slab_alloc_node kfence_alloc Once the kfence allocation is successful,slab->obj_exts will not be empty, because it has already been assigned a value in kfence_init_pool. Since in the prepare_slab_obj_exts_hook function,we perform a check for s->flags & (SLAB_NO_OBJ_EXT | SLAB_NOLEAKTRACE),the alloc_tag_add function will not be called as a result.Therefore,ref->ct remains NULL. However,when we call mem_pool_free,since obj_ext is not empty, it eventually leads to the alloc_tag_sub scenario being invoked. This is where the warning occurs. So we should add corresponding checks in the alloc_tagging_slab_free_hook. For __GFP_NO_OBJ_EXT case,I didn't see the specific case where it's using kfence,so I won't add the corresponding check in alloc_tagging_slab_free_hook for now. [ 3.734349] ------------[ cut here ]------------ [ 3.734807] alloc_tag was not set [ 3.735129] WARNING: CPU: 4 PID: 40 at ./include/linux/alloc_tag.h:130 kmem_cache_free+0x444/0x574 [ 3.735866] Modules linked in: autofs4 [ 3.736211] CPU: 4 UID: 0 PID: 40 Comm: ksoftirqd/4 Tainted: G W 6.11.0-rc3-dirty #1 [ 3.736969] Tainted: [W]=WARN [ 3.737258] Hardware name: QEMU KVM Virtual Machine, BIOS unknown 2/2/2022 [ 3.737875] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 3.738501] pc : kmem_cache_free+0x444/0x574 [ 3.738951] lr : kmem_cache_free+0x444/0x574 [ 3.739361] sp : ffff80008357bb60 [ 3.739693] x29: ffff80008357bb70 x28: 0000000000000000 x27: 0000000000000000 [ 3.740338] x26: ffff80008207f000 x25: ffff000b2eb2fd60 x24: ffff0000c0005700 [ 3.740982] x23: ffff8000804229e4 x22: ffff800082080000 x21: ffff800081756000 [ 3.741630] x20: fffffd7ff8253360 x19: 00000000000000a8 x18: ffffffffffffffff [ 3.742274] x17: ffff800ab327f000 x16: ffff800083398000 x15: ffff800081756df0 [ 3.742919] x14: 0000000000000000 x13: 205d344320202020 x12: 5b5d373038343337 [ 3.743560] x11: ffff80008357b650 x10: 000000000000005d x9 : 00000000ffffffd0 [ 3.744231] x8 : 7f7f7f7f7f7f7f7f x7 : ffff80008237bad0 x6 : c0000000ffff7fff [ 3.744907] x5 : ffff80008237ba78 x4 : ffff8000820bbad0 x3 : 0000000000000001 [ 3.745580] x2 : 68d66547c09f7800 x1 : 68d66547c09f7800 x0 : 0000000000000000 [ 3.746255] Call trace: [ 3.746530] kmem_cache_free+0x444/0x574 [ 3.746931] mem_pool_free+0x44/0xf4 [ 3.747306] free_object_rcu+0xc8/0xdc [ 3.747693] rcu_do_batch+0x234/0x8a4 [ 3.748075] rcu_core+0x230/0x3e4 [ 3.748424] rcu_core_si+0x14/0x1c [ 3.748780] handle_softirqs+0x134/0x378 [ 3.749189] run_ksoftirqd+0x70/0x9c [ 3.749560] smpboot_thread_fn+0x148/0x22c [ 3.749978] kthread+0x10c/0x118 [ 3.750323] ret_from_fork+0x10/0x20 [ 3.750696] ---[ end trace 0000000000000000 ]--- Link: https://lkml.kernel.org/r/20240816013336.17505-1-hao.ge@linux.dev Fixes: 4b8736964640 ("mm/slab: add allocation accounting into slab allocation and free paths") Signed-off-by: Hao Ge <gehao@kylinos.cn> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <kees@kernel.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Pekka Enberg <penberg@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01nilfs2: fix state management in error path of log writing functionRyusuke Konishi
After commit a694291a6211 ("nilfs2: separate wait function from nilfs_segctor_write") was applied, the log writing function nilfs_segctor_do_construct() was able to issue I/O requests continuously even if user data blocks were split into multiple logs across segments, but two potential flaws were introduced in its error handling. First, if nilfs_segctor_begin_construction() fails while creating the second or subsequent logs, the log writing function returns without calling nilfs_segctor_abort_construction(), so the writeback flag set on pages/folios will remain uncleared. This causes page cache operations to hang waiting for the writeback flag. For example, truncate_inode_pages_final(), which is called via nilfs_evict_inode() when an inode is evicted from memory, will hang. Second, the NILFS_I_COLLECTED flag set on normal inodes remain uncleared. As a result, if the next log write involves checkpoint creation, that's fine, but if a partial log write is performed that does not, inodes with NILFS_I_COLLECTED set are erroneously removed from the "sc_dirty_files" list, and their data and b-tree blocks may not be written to the device, corrupting the block mapping. Fix these issues by uniformly calling nilfs_segctor_abort_construction() on failure of each step in the loop in nilfs_segctor_do_construct(), having it clean up logs and segment usages according to progress, and correcting the conditions for calling nilfs_redirty_inodes() to ensure that the NILFS_I_COLLECTED flag is cleared. Link: https://lkml.kernel.org/r/20240814101119.4070-1-konishi.ryusuke@gmail.com Fixes: a694291a6211 ("nilfs2: separate wait function from nilfs_segctor_write") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>