summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-01-10MAINTAINERS: powerpc: Update my statusMichael Ellerman
Maddy is taking over the day-to-day maintenance of powerpc. I will still be around to help, and as a backup. Re-order the main POWERPC list to put Maddy first to reflect that. KVM/powerpc patches will be handled by Maddy via the powerpc tree with review from Nick, so replace myself with Maddy there. Remove myself from BPF, leaving Hari & Christophe as maintainers. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-01-10io_uring: expose read/write attribute capabilityAnuj Gupta
After commit 9a213d3b80c0, we can pass additional attributes along with read/write. However, userspace doesn't know that. Add a new feature flag IORING_FEAT_RW_ATTR, to notify the userspace that the kernel has this ability. Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Reviewed-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Tested-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20241205062109.1788-1-anuj20.g@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-10smb: client: sync the root session and superblock context passwords before ↵Meetakshi Setiya
automounting In some cases, when password2 becomes the working password, the client swaps the two password fields in the root session struct, but not in the smb3_fs_context struct in cifs_sb. DFS automounts inherit fs context from their parent mounts. Therefore, they might end up getting the passwords in the stale order. The automount should succeed, because the mount function will end up retrying with the actual password anyway. But to reduce these unnecessary session setup retries for automounts, we can sync the parent context's passwords with the root session's passwords before duplicating it to the child's fs context. Cc: stable@vger.kernel.org Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-01-10Merge tag 'sched_ext-for-6.13-rc6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - Fix corner case bug where ops.dispatch() couldn't extend the execution of the current task if SCX_OPS_ENQ_LAST is set. - Fix ops.cpu_release() not being called when a SCX task is preempted by a higher priority sched class task. - Fix buitin idle mask being incorrectly left as busy after an idle CPU is picked and kicked. - scx_ops_bypass() was unnecessarily using rq_lock() which comes with rq pinning related sanity checks which could trigger spuriously. Switch to raw_spin_rq_lock(). * tag 'sched_ext-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: idle: Refresh idle masks during idle-to-idle transitions sched_ext: switch class when preempted by higher priority scheduler sched_ext: Replace rq_lock() to raw_spin_rq_lock() in scx_ops_bypass() sched_ext: keep running prev when prev->scx.slice != 0
2025-01-10Merge tag 'cgroup-for-6.13-rc6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "Cpuset fixes: - Fix isolated CPUs leaking into sched domains - Remove now unnecessary kernfs active break which can trigger a warning - Comment updates" * tag 'cgroup-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup/cpuset: remove kernfs active break cgroup/cpuset: Prevent leakage of isolated CPUs into sched domains cgroup/cpuset: Remove stale text
2025-01-10Merge tag 'wq-for-6.13-rc6-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fix from Tejun Heo: - Add a WARN_ON_ONCE() on queue_delayed_work_on() on an offline CPU as such work items won't get executed till the CPU comes back online * tag 'wq-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: warn if delayed_work is queued to an offlined cpu.
2025-01-10Merge tag 'thermal-6.13-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fix from Rafael Wysocki: "Fix an OF node leak in the code parsing thermal zone DT properties (Joe Hattori)" * tag 'thermal-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: of: fix OF node leak in of_thermal_zone_find()
2025-01-10wifi: ath9k: cleanup ath9k_hw_get_nf_hist_mid()Dmitry Antipov
In 'ath9k_hw_get_nf_hist_mid()', prefer 'memcpy()' and 'sort()' over an ad-hoc things. Briefly tested as a separate module. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Link: https://patch.msgid.link/20250109080703.106692-1-dmantipov@yandex.ru Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-01-10Merge tag 'acpi-6.13-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "Add two more ACPI IRQ override quirks and update the code using them to avoid unnecessary overhead (Hans de Goede)" * tag 'acpi-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: resource: acpi_dev_irq_override(): Check DMI match last ACPI: resource: Add TongFang GM5HG0A to irq1_edge_low_force_override[] ACPI: resource: Add Asus Vivobook X1504VAP to irq1_level_low_skip_override[]
2025-01-10sched_ext: idle: Refresh idle masks during idle-to-idle transitionsAndrea Righi
With the consolidation of put_prev_task/set_next_task(), see commit 436f3eed5c69 ("sched: Combine the last put_prev_task() and the first set_next_task()"), we are now skipping the transition between these two functions when the previous and the next tasks are the same. As a result, the scx idle state of a CPU is updated only when transitioning to or from the idle thread. While this is generally correct, it can lead to uneven and inefficient core utilization in certain scenarios [1]. A typical scenario involves proactive wake-ups: scx_bpf_pick_idle_cpu() selects and marks an idle CPU as busy, followed by a wake-up via scx_bpf_kick_cpu(), without dispatching any tasks. In this case, the CPU continues running the idle thread, returns to idle, but remains marked as busy, preventing it from being selected again as an idle CPU (until a task eventually runs on it and releases the CPU). For example, running a workload that uses 20% of each CPU, combined with an scx scheduler using proactive wake-ups, results in the following core utilization: CPU 0: 25.7% CPU 1: 29.3% CPU 2: 26.5% CPU 3: 25.5% CPU 4: 0.0% CPU 5: 25.5% CPU 6: 0.0% CPU 7: 10.5% To address this, refresh the idle state also in pick_task_idle(), during idle-to-idle transitions, but only trigger ops.update_idle() on actual state changes to prevent unnecessary updates to the scx scheduler and maintain balanced state transitions. With this change in place, the core utilization in the previous example becomes the following: CPU 0: 18.8% CPU 1: 19.4% CPU 2: 18.0% CPU 3: 18.7% CPU 4: 19.3% CPU 5: 18.9% CPU 6: 18.7% CPU 7: 19.3% [1] https://github.com/sched-ext/scx/pull/1139 Fixes: 7c65ae81ea86 ("sched_ext: Don't call put_prev_task_scx() before picking the next task") Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10veristat: Document verifier log dumping capabilityDaniel Xu
`-vl2` is a useful combination of flags to dump the entire verification log. This is helpful when making changes to the verifier, as you can see what it thinks program one instruction at a time. This was more or less a hidden feature before. Document it so others can discover it. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/d57bbcca81e06ae8dcdadaedb99a48dced67e422.1736466129.git.dxu@dxuuu.xyz
2025-01-10bpftool: Fix control flow graph segfault during edge creationChristoph Werle
If the last instruction of a control flow graph building block is a BPF_CALL, an incorrect edge with e->dst set to NULL is created and results in a segfault during graph output. Ensure that BPF_CALL as last instruction of a building block is handled correctly and only generates a single edge unlike actual BPF_JUMP* instructions. Signed-off-by: Christoph Werle <christoph.werle@longjmp.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Quentin Monnet <qmo@kernel.org> Reviewed-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/bpf/20250108220937.1470029-1-christoph.werle@longjmp.de
2025-01-11Merge tag 'cgroup-dmem-drm-v2' of ↵Dave Airlie
git://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into drm-next DMEM cgroup pull request This introduces a new cgroup controller to limit the device memory. Notable users would be DRM, dma-buf heaps, or v4l2. This pull request is based on the series developped by Maarten Lankhorst, Friedrich Vock, and I: https://lore.kernel.org/all/20241204134410.1161769-1-dev@lankhorst.se/ Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250110-cryptic-warm-mandrill-b71f5d@houat
2025-01-10Merge tag 'linux-cpupower-6.14-rc1' of ↵Rafael J. Wysocki
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shuah/linux Merge cpupower utility updates for 6.14 from Shuah Khan: "Several fixes, cleanups and AMD support enhancements: - fix TSC MHz calculation - Add install and uninstall options to bindings makefile - Add header changes for cpufreq.h to SWIG bindings - selftests/cpufreq: gitignore output files and clean them in make clean - Remove spurious return statement - Add support for parsing 'enabled' or 'disabled' strings from table - Add support for amd-pstate preferred core rankings - Don't try to read frequency from hardware when kernel uses aperf mperf - Add support for showing energy performance preference - Don't fetch maximum latency when EPP is enabled - Adjust whitespace for amd-pstate specific prints - Fix cross compilation - revise is_valid flag handling for idle_monitor" * tag 'linux-cpupower-6.14-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shuah/linux: pm: cpupower: Add header changes for cpufreq.h to SWIG bindings pm: cpupower: Add install and uninstall options to bindings makefile cpupower: Adjust whitespace for amd-pstate specific prints cpupower: Don't fetch maximum latency when EPP is enabled cpupower: Add support for showing energy performance preference cpupower: Don't try to read frequency from hardware when kernel uses aperfmperf cpupower: Add support for amd-pstate preferred core rankings cpupower: Add support for parsing 'enabled' or 'disabled' strings from table cpupower: Remove spurious return statement cpupower: fix TSC MHz calculation cpupower: revise is_valid flag handling for idle_monitor pm: cpupower: Makefile: Fix cross compilation selftests/cpufreq: gitignore output files and clean them in make clean
2025-01-10selftests/bpf: Add a test for kprobe multi with unique_matchYonghong Song
Add a kprobe multi subtest to test kprobe multi unique_match option. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250109174028.3368967-1-yonghong.song@linux.dev
2025-01-10libbpf: Add unique_match option for multi kprobeYonghong Song
Jordan reported an issue in Meta production environment where func try_to_wake_up() is renamed to try_to_wake_up.llvm.<hash>() by clang compiler at lto mode. The original 'kprobe/try_to_wake_up' does not work any more since try_to_wake_up() does not match the actual func name in /proc/kallsyms. There are a couple of ways to resolve this issue. For example, in attach_kprobe(), we could do lookup in /proc/kallsyms so try_to_wake_up() can be replaced by try_to_wake_up.llvm.<hach>(). Or we can force users to use bpf_program__attach_kprobe() where they need to lookup /proc/kallsyms to find out try_to_wake_up.llvm.<hach>(). But these two approaches requires extra work by either libbpf or user. Luckily, suggested by Andrii, multi kprobe already supports wildcard ('*') for symbol matching. In the above example, 'try_to_wake_up*' can match to try_to_wake_up() or try_to_wake_up.llvm.<hash>() and this allows bpf prog works for different kernels as some kernels may have try_to_wake_up() and some others may have try_to_wake_up.llvm.<hash>(). The original intention is to kprobe try_to_wake_up() only, so an optional field unique_match is added to struct bpf_kprobe_multi_opts. If the field is set to true, the number of matched functions must be one. Otherwise, the attachment will fail. In the above case, multi kprobe with 'try_to_wake_up*' and unique_match preserves user functionality. Reported-by: Jordan Rome <linux@jordanrome.com> Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250109174023.3368432-1-yonghong.song@linux.dev
2025-01-10io_uring: don't touch sqd->thread off tw addPavel Begunkov
With IORING_SETUP_SQPOLL all requests are created by the SQPOLL task, which means that req->task should always match sqd->thread. Since accesses to sqd->thread should be separately protected, use req->task in io_req_normal_work_add() instead. Note, in the eyes of io_req_normal_work_add(), the SQPOLL task struct is always pinned and alive, and sqd->thread can either be the task or NULL. It's only problematic if the compiler decides to reload the value after the null check, which is not so likely. Cc: stable@vger.kernel.org Cc: Bui Quang Minh <minhquangbui99@gmail.com> Reported-by: lizetao <lizetao1@huawei.com> Fixes: 78f9b61bd8e54 ("io_uring: wake SQPOLL task when task_work is added to an empty queue") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1cbbe72cf32c45a8fee96026463024cd8564a7d7.1736541357.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-10io_uring/sqpoll: zero sqd->thread on tctx errorsPavel Begunkov
Syzkeller reports: BUG: KASAN: slab-use-after-free in thread_group_cputime+0x409/0x700 kernel/sched/cputime.c:341 Read of size 8 at addr ffff88803578c510 by task syz.2.3223/27552 Call Trace: <TASK> ... kasan_report+0x143/0x180 mm/kasan/report.c:602 thread_group_cputime+0x409/0x700 kernel/sched/cputime.c:341 thread_group_cputime_adjusted+0xa6/0x340 kernel/sched/cputime.c:639 getrusage+0x1000/0x1340 kernel/sys.c:1863 io_uring_show_fdinfo+0xdfe/0x1770 io_uring/fdinfo.c:197 seq_show+0x608/0x770 fs/proc/fd.c:68 ... That's due to sqd->task not being cleared properly in cases where SQPOLL task tctx setup fails, which can essentially only happen with fault injection to insert allocation errors. Cc: stable@vger.kernel.org Fixes: 1251d2025c3e1 ("io_uring/sqpoll: early exit thread if task_context wasn't allocated") Reported-by: syzbot+3d92cfcfa84070b0a470@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/efc7ec7010784463b2e7466d7b5c02c2cb381635.1736519461.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-10Merge tag 'drm-fixes-2025-01-11' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Regular weekly fixes, this has the usual amdgpu/xe/i915 bits. There is a bigger bunch of mediatek patches that I considered not including at this stage, but all the changes (except for one were obvious small fixes, and the rotation one is a few lines, and I suppose will help someone have their screen up the right way), I decided to include it since I expect it got slowed down by holidays etc, and it's not that mainstream a hw platform. i915: - Revert "drm/i915/hdcp: Don't enable HDCP1.4 directly from check_link" amdgpu: - Display interrupt fixes - Fix display max surface mismatches - Fix divide error in DM plane scale calcs - Display divide by 0 checks in dml helpers - SMU 13 AD/DC interrrupt handling fix - Fix locking around buddy trim handling amdkfd: - Fix page fault with shader debugger enabled - Fix eviction fence wq handling xe: - Avoid a NULL ptr deref when wedging - Fix power gate sequence on DG1 mediatek: - Revert "drm/mediatek: dsi: Correct calculation formula of PHY Timing" - Set private->all_drm_private[i]->drm to NULL if mtk_drm_bind returns err - Move mtk_crtc_finish_page_flip() to ddp_cmdq_cb() - Only touch DISP_REG_OVL_PITCH_MSB if AFBC is supported - Add support for 180-degree rotation in the display driver - Stop selecting foreign drivers - Revert "drm/mediatek: Switch to for_each_child_of_node_scoped()" - Fix YCbCr422 color format issue for DP - Fix mode valid issue for dp - dp: Reference common DAI properties - dsi: Add registers to pdata to fix MT8186/MT8188 - Remove unneeded semicolon - Add return value check when reading DPCD - Initialize pointer in mtk_drm_of_ddp_path_build_one()" * tag 'drm-fixes-2025-01-11' of https://gitlab.freedesktop.org/drm/kernel: (26 commits) drm/xe/dg1: Fix power gate sequence. drm/xe: Fix tlb invalidation when wedging Revert "drm/i915/hdcp: Don't enable HDCP1.4 directly from check_link" drm/amdgpu: Add a lock when accessing the buddy trim function drm/amd/pm: fix BUG: scheduling while atomic drm/amdkfd: wq_release signals dma_fence only when available drm/amd/display: Add check for granularity in dml ceil/floor helpers drm/amdkfd: fixed page fault when enable MES shader debugger drm/amd/display: fix divide error in DM plane scale calcs drm/amd/display: increase MAX_SURFACES to the value supported by hw drm/amd/display: fix page fault due to max surface definition mismatch drm/amd/display: Remove unnecessary amdgpu_irq_get/put drm/mediatek: Initialize pointer in mtk_drm_of_ddp_path_build_one() drm/mediatek: Add return value check when reading DPCD drm/mediatek: Remove unneeded semicolon drm/mediatek: mtk_dsi: Add registers to pdata to fix MT8186/MT8188 dt-bindings: display: mediatek: dp: Reference common DAI properties drm/mediatek: Fix mode valid issue for dp drm/mediatek: Fix YCbCr422 color format issue for DP Revert "drm/mediatek: Switch to for_each_child_of_node_scoped()" ...
2025-01-10wifi: ath12k: Support pdev Puncture StatsRajat Soni
Add support to request pdev puncture stats from firmware through HTT stats type 46. These stats give the count of number of subbands used in different wifi standards. Sample output: ------------- echo 46 > /sys/kernel/debug/ath12k/pci-0000\:06\:00.0/mac0/htt_stats_type cat /sys/kernel/debug/ath12k/pci-0000\:06\:00.0/mac0/htt_stats HTT_PDEV_PUNCTURE_STATS_TLV: mac_id = 0 tx_ofdm_su_last_used_pattern_mask = 0x00000001 tx_ofdm_su_num_subbands_used_cnt_01 = 217 tx_ofdm_su_num_subbands_used_cnt_02 = 0 tx_ofdm_su_num_subbands_used_cnt_03 = 0 ..... HTT_PDEV_PUNCTURE_STATS_TLV: mac_id = 0 tx_ax_dl_mu_ofdma_last_used_pattern_mask = 0x00000000 tx_ax_dl_mu_ofdma_num_subbands_used_cnt_01 = 0 tx_ax_dl_mu_ofdma_num_subbands_used_cnt_02 = 0 tx_ax_dl_mu_ofdma_num_subbands_used_cnt_03 = 0 ..... HTT_PDEV_PUNCTURE_STATS_TLV: mac_id = 0 tx_be_dl_mu_ofdma_last_used_pattern_mask = 0x00000000 tx_be_dl_mu_ofdma_num_subbands_used_cnt_01 = 0 tx_be_dl_mu_ofdma_num_subbands_used_cnt_02 = 0 tx_be_dl_mu_ofdma_num_subbands_used_cnt_03 = 0 ..... HTT_PDEV_PUNCTURE_STATS_TLV: mac_id = 0 rx_ax_ul_mu_ofdma_last_used_pattern_mask = 0x00000000 rx_ax_ul_mu_ofdma_num_subbands_used_cnt_01 = 0 rx_ax_ul_mu_ofdma_num_subbands_used_cnt_02 = 0 rx_ax_ul_mu_ofdma_num_subbands_used_cnt_03 = 0 ..... HTT_PDEV_PUNCTURE_STATS_TLV: mac_id = 0 rx_be_ul_mu_ofdma_last_used_pattern_mask = 0x00000000 rx_be_ul_mu_ofdma_num_subbands_used_cnt_01 = 0 rx_be_ul_mu_ofdma_num_subbands_used_cnt_02 = 0 rx_be_ul_mu_ofdma_num_subbands_used_cnt_03 = 0 ..... Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1 Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4 Signed-off-by: Rajat Soni <quic_rajson@quicinc.com> Signed-off-by: Roopni Devanathan <quic_rdevanat@quicinc.com> Acked-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20241218035711.2573584-3-quic_rdevanat@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-01-10wifi: ath12k: Support AST Entry StatsRoopni Devanathan
Add support to request Address Search Table(AST) entries stats from firmware through HTT stats type 41. These stats give AST entries related information such as software peer id, MAC address, pdev id, vdev, id, next hop, etc. Sample output: ------------- echo 41 > /sys/kernel/debug/ath12k/pci-0000\:06\:00.0/mac0/htt_stats_type cat /sys/kernel/debug/ath12k/pci-0000\:06\:00.0/mac0/htt_stats HTT_AST_ENTRY_TLV: ast_index = 10 mac_addr = 00:00:00:01:00:00 sw_peer_id = 0 pdev_id = 3 vdev_id = 255 next_hop = 0 mcast = 0 monitor_direct = 0 mesh_sta = 0 mec = 0 intra_bss = 0 Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0.1-00029-QCAHKSWPL_SILICONZ-1 Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4 Signed-off-by: Roopni Devanathan <quic_rdevanat@quicinc.com> Acked-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20241218035711.2573584-2-quic_rdevanat@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-01-10wifi: ath12k: Support Transmit Buffer OFDMA StatsPradeep Kumar Chitrapu
Add support to request OFDMA stats of transmit buffers from firmware through HTT stats type 32. These stats give information about NDPA, NDP, BRP and steering mechanisms. Note: WCN7850 firmware version - WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4 does not support HTT stats type 32. Sample output: ------------- echo 32 > /sys/kernel/debug/ath12k/pci-0000\:06\:00.0/mac0/htt_stats_type cat /sys/kernel/debug/ath12k/pci-0000\:06\:00.0/mac0/htt_stats HTT_TXBF_OFDMA_AX_NDPA_STATS_TLV: ax_ofdma_ndpa_queued = 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0, 16:0, 17:0, 18:0, 19:0, 20:0, 21:0, 22:0, 23:0, 24:0, 25:0, 26:0, 27:0, 28:0, 29:0, 30:0, 31:0, 32:0, 33:0, 34:0, 35:0, 36:0, 37:0 ax_ofdma_ndpa_tried = 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0, 16:0, 17:0, 18:0, 19:0, 20:0, 21:0, 22:0, 23:0, 24:0, 25:0, 26:0, 27:0, 28:0, 29:0, 30:0, 31:0, 32:0, 33:0, 34:0, 35:0, 36:0, 37:0 ..... HTT_TXBF_OFDMA_AX_NDP_STATS_TLV: ax_ofdma_ndp_queued = 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0, 16:0, 17:0, 18:0, 19:0, 20:0, 21:0, 22:0, 23:0, 24:0, 25:0, 26:0, 27:0, 28:0, 29:0, 30:0, 31:0, 32:0, 33:0, 34:0, 35:0, 36:0, 37:0 ax_ofdma_ndp_tried = 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0, 16:0, 17:0, 18:0, 19:0, 20:0, 21:0, 22:0, 23:0, 24:0, 25:0, 26:0, 27:0, 28:0, 29:0, 30:0, 31:0, 32:0, 33:0, 34:0, 35:0, 36:0, 37:0 ..... HTT_TXBF_OFDMA_AX_BRP_STATS_TLV: ax_ofdma_brpoll_queued = 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0, 16:0, 17:0, 18:0, 19:0, 20:0, 21:0, 22:0, 23:0, 24:0, 25:0, 26:0, 27:0, 28:0, 29:0, 30:0, 31:0, 32:0, 33:0, 34:0, 35:0, 36:0, 37:0 ax_ofdma_brpoll_tied = 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0, 16:0, 17:0, 18:0, 19:0, 20:0, 21:0, 22:0, 23:0, 24:0, 25:0, 26:0, 27:0, 28:0, 29:0, 30:0, 31:0, 32:0, 33:0, 34:0, 35:0, 36:0, 37:0 ..... HTT_TXBF_OFDMA_AX_STEER_STATS_TLV: ax_ofdma_num_ppdu_steer = 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0, 16:0, 17:0, 18:0, 19:0, 20:0, 21:0, 22:0, 23:0, 24:0, 25:0, 26:0, 27:0, 28:0, 29:0, 30:0, 31:0, 32:0, 33:0, 34:0, 35:0, 36:0, 37:0 ax_ofdma_num_usrs_prefetch = 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0, 14:0, 15:0, 16:0, 17:0, 18:0, 19:0, 20:0, 21:0, 22:0, 23:0, 24:0, 25:0, 26:0, 27:0, 28:0, 29:0, 30:0, 31:0, 32:0, 33:0, 34:0, 35:0, 36:0, 37:0 ..... HTT_TXBF_OFDMA_AX_STEER_MPDU_STATS_TLV: rbo_steer_mpdus_tried = 0 rbo_steer_mpdus_failed = 0 sifs_steer_mpdus_tried = 0 sifs_steer_mpdus_failed = 0 Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00214-QCAHKSWPL_SILICONZ-1 Signed-off-by: Pradeep Kumar Chitrapu <quic_pradeepc@quicinc.com> Signed-off-by: Roopni Devanathan <quic_rdevanat@quicinc.com> Acked-by: Jeff Johnson <jjohnson@kernel.org> Link: https://patch.msgid.link/20241128110949.3672364-3-quic_rdevanat@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-01-10wifi: ath12k: Support Transmit Rate Buffer StatsPradeep Kumar Chitrapu
Add support to request transmit rate buffer stats from firmware through HTT stats type 31. These stats give information such as MCS, NSS and bandwidth of transmit and input buffer. Sample output: ------------- echo 31 > /sys/kernel/debug/ath12k/pci-0000\:06\:00.0/mac0/htt_stats_type cat /sys/kernel/debug/ath12k/pci-0000\:06\:00.0/mac0/htt_stats HTT_STATS_PDEV_TX_RATE_TXBF_STATS: Legacy OFDM Rates: 6 Mbps: 0, 9 Mbps: 0, 12 Mbps: 0, 18 Mbps: 0 24 Mbps: 0, 36 Mbps: 0, 48 Mbps: 0, 54 Mbps: 0 tx_ol_mcs = 0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0 tx_ibf_mcs = 0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0 tx_txbf_mcs = 0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0, 8:0, 9:0, 10:0, 11:0, 12:0, 13:0 tx_ol_nss = 0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0 tx_ibf_nss = 0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0 tx_txbf_nss = 0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0 tx_ol_bw = 0:0, 1:0, 2:0, 3:0, 4:0 half_tx_ol_bw = 0:0, 1:0, 2:0, 3:0, 4:0 quarter_tx_ol_bw = 0:0, 1:0, 2:0, 3:0, 4:0 tx_ibf_bw = 0:0, 1:0, 2:0, 3:0, 4:0 half_tx_ibf_bw = 0:0, 1:0, 2:0, 3:0, 4:0 quarter_tx_ibf_bw = 0:0, 1:0, 2:0, 3:0, 4:0 tx_txbf_bw = 0:0, 1:0, 2:0, 3:0, 4:0 half_tx_txbf_bw = 0:0, 1:0, 2:0, 3:0, 4:0 quarter_tx_txbf_bw = 0:0, 1:0, 2:0, 3:0, 4:0 HTT_STATS_PDEV_TXBF_FLAG_RETURN_STATS: TXBF_reason_code_stats: 0:0, 1:0, 2:0, 3:0, 4:0, 5:0, 6:0, 7:0 Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.1.1-00214-QCAHKSWPL_SILICONZ-1 Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3 Signed-off-by: Pradeep Kumar Chitrapu <quic_pradeepc@quicinc.com> Signed-off-by: Roopni Devanathan <quic_rdevanat@quicinc.com> Acked-by: Kalle Valo <kvalo@kernel.org> Acked-by: Jeff Johnson <jjohnson@kernel.org> Link: https://patch.msgid.link/20241128110949.3672364-2-quic_rdevanat@quicinc.com Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-01-10Merge tag 'riscv-for-linus-6.13-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - a handful of selftest fixes - fix a memory leak in relocation processing during module loading - avoid sleeping in die() - fix kprobe instruction slot address calculations - fix DT node reference leak in SBI idle probing - avoid initializing out of bounds pages on sparse vmemmap systems with a gap at the start of their physical memory map - fix backtracing through exceptions - _Q_PENDING_LOOPS is now defined whenever QUEUED_SPINLOCKS=y - local labels in entry.S are now marked with ".L", which prevents them from trashing backtraces - a handful of fixes for SBI-based performance counters * tag 'riscv-for-linus-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: drivers/perf: riscv: Do not allow invalid raw event config drivers/perf: riscv: Return error for default case drivers/perf: riscv: Fix Platform firmware event data tools: selftests: riscv: Add test count for vstate_prctl tools: selftests: riscv: Add pass message for v_initval_nolibc riscv: use local label names instead of global ones in assembly riscv: qspinlock: Fixup _Q_PENDING_LOOPS definition riscv: stacktrace: fix backtracing through exceptions riscv: mm: Fix the out of bound issue of vmemmap address cpuidle: riscv-sbi: fix device node release in early exit of for_each_possible_cpu riscv: kprobes: Fix incorrect address calculation riscv: Fix sleeping in invalid context in die() riscv: module: remove relocation_head rel_entry member allocation riscv: selftests: Fix warnings pointer masking test
2025-01-10drm/amd/display: Initialize denominator defaults to 1Alex Hung
[WHAT & HOW] Variables, used as denominators and maybe not assigned to other values, should be initialized to non-zero to avoid DIVIDE_BY_ZERO, as reported by Coverity. Reviewed-by: Austin Zheng <austin.zheng@amd.com> Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e2c4c6c10542ccfe4a0830bb6c9fd5b177b7bbb7)
2025-01-10drm/amd/display: Use HW lock mgr for PSR1Tom Chung
[Why] Without the dmub hw lock, it may cause the lock timeout issue while do modeset on PSR1 eDP panel. [How] Allow dmub hw lock for PSR1. Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a2b5a9956269f4c1a09537177f18ab0229fe79f7)
2025-01-10drm/amd/display: Remove unnecessary eDP power downYiling Chen
[why] When first time of link training is fail, eDP would be powered down and would not be powered up for next retry link training. It causes that all of retry link linking would be fail. [how] We has extracted both power up and down sequence from enable/disable link output function before DCN32. We remov eDP power down in dcn32_disable_link_output(). Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Yiling Chen <yi-ling.chen2@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f5860c88cdfe7300d08c1aef881bba0cac369e34)
2025-01-10drm/amd/display: Do not elevate mem_type change to full updateLeo Li
[Why] There should not be any need to revalidate bandwidth on memory placement change, since the fb is expected to be pinned to DCN-accessable memory before scanout. For APU it's DRAM, and DGPU, it's VRAM. However, async flips + memory type change needs to be rejected. [How] Do not set lock_and_validation_needed on mem_type change. Instead, reject an async_flip request if the crtc's buffer(s) changed mem_type. This may fix stuttering/corruption experienced with PSR SU and PSR1 panels, if the compositor allocates fbs in both VRAM carveout and GTT and flips between them. Fixes: a7c0cad0dc06 ("drm/amd/display: ensure async flips are only accepted for fast updates") Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4caacd1671b7a013ad04cd8b6398f002540bdd4d) Cc: stable@vger.kernel.org
2025-01-10drm/amd/display: Do not wait for PSR disable on vbl enableLeo Li
[Why] Outside of a modeset/link configuration change, we should not have to wait for the panel to exit PSR. Depending on the panel and it's state, it may take multiple frames for it to exit PSR. Therefore, waiting in all scenarios may cause perceived stuttering, especially in combination with faster vblank shutdown. [How] PSR1 disable is hooked up to the vblank enable event, and vice versa. In case of vblank enable, do not wait for panel to exit PSR, but still wait in all other cases. We also avoid a call to unnecessarily change power_opts on disable - this ends up sending another command to dmcub fw. When testing against IGT, some crc tests like kms_plane_alpha_blend and amd_hotplug were failing due to CRC timeouts. This was found to be caused by the early return before HW has fully exited PSR1. Fix this by first making sure we grab a vblank reference, then waiting for panel to exit PSR1, before programming hw for CRC generation. Fixes: 58a261bfc967 ("drm/amd/display: use a more lax vblank enable policy for older ASICs") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3743 Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit aa6713fa2046f4c09bf3013dd1420ae15603ca6f) Cc: stable@vger.kernel.org
2025-01-10workqueue: warn if delayed_work is queued to an offlined cpu.Imran Khan
delayed_work submitted to an offlined cpu, will not get executed, after the specified delay if the cpu remains offline. If the cpu never comes online the work will never get executed. checking for online cpu in __queue_delayed_work, does not sound like a good idea because to do this reliably we need hotplug lock and since work may be submitted from atomic contexts, we would have to use cpus_read_trylock. But if trylock fails we would queue the work on any cpu and this may not be optimal because our intended cpu might still be online. Putting a WARN_ON_ONCE for an already offlined cpu, will indicate users of queue_delayed_work_on, if they are (wrongly) trying to queue delayed_work on offlined cpu. Also indicate the problem of using offlined cpu with queue_delayed_work_on, in its description. Signed-off-by: Imran Khan <imran.f.khan@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10Revert "drm/amd/display: Enable urgent latency adjustments for DCN35"Nicholas Susanto
Revert commit 284f141f5ce5 ("drm/amd/display: Enable urgent latency adjustments for DCN35") [Why & How] Urgent latency increase caused 2.8K OLED monitor caused it to block this panel support P0. Reverting this change does not reintroduce the netflix corruption issue which it fixed. Fixes: 284f141f5ce5 ("drm/amd/display: Enable urgent latency adjustments for DCN35") Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Nicholas Susanto <Nicholas.Susanto@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c7ccfc0d4241a834c25a9a9e1e78b388b4445d23) Cc: stable@vger.kernel.org
2025-01-10drm/amd/display: Reduce accessing remote DPCD overheadWayne Lin
[Why] Observed frame rate get dropped by tool like glxgear. Even though the output to monitor is 60Hz, the rendered frame rate drops to 30Hz lower. It's due to code path in some cases will trigger dm_dp_mst_is_port_support_mode() to read out remote Link status to assess the available bandwidth for dsc maniplation. Overhead of keep reading remote DPCD is considerable. [How] Store the remote link BW in mst_local_bw and use end-to-end full_pbn as an indicator to decide whether update the remote link bw or not. Whenever we need the info to assess the BW, visit the stored one first. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3720 Fixes: fa57924c76d9 ("drm/amd/display: Refactor function dm_dp_mst_is_port_support_mode()") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4a9a918545455a5979c6232fcf61ed3d8f0db3ae) Cc: stable@vger.kernel.org
2025-01-10drm/amd/display: Validate mdoe under MST LCT=1 case as wellWayne Lin
[Why & How] Currently in dm_dp_mst_is_port_support_mode(), when valdidating mode under dsc decoding at the last DP link config, we only validate the case when there is an UFP. However, if the MSTB LCT=1, there is no UFP. Under this case, use root_link_bw_in_kbps as the available bw to compare. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3720 Fixes: fa57924c76d9 ("drm/amd/display: Refactor function dm_dp_mst_is_port_support_mode()") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a04d9534a8a75b2806c5321c387be450c364b55e) Cc: stable@vger.kernel.org
2025-01-10drm/amdgpu/smu13: update powersave optimizationsAlex Deucher
Only apply when compute profile is selected. This is the only supported configuration. Selecting other profiles can lead to performane degradations. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d477e39532d725b1cdb3c8005c689c74ffbf3b94) Cc: stable@vger.kernel.org # 6.12.x
2025-01-10sched_ext: Use time helpers in BPF schedulersChangwoo Min
Modify the BPF schedulers to use time helpers defined in common.bpf.h Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10sched_ext: Replace bpf_ktime_get_ns() to scx_bpf_now()Changwoo Min
In the BPF schedulers that use bpf_ktime_get_ns() -- scx_central and scx_flatcg, replace bpf_ktime_get_ns() calls to scx_bpf_now(). Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10sched_ext: Add time helpers for BPF schedulersChangwoo Min
The following functions are added for BPF schedulers: - time_delta(after, before) - time_after(a, b) - time_before(a, b) - time_after_eq(a, b) - time_before_eq(a, b) - time_in_range(a, b, c) - time_in_range_open(a, b, c) Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10sched_ext: Add scx_bpf_now() for BPF schedulerChangwoo Min
scx_bpf_now() is added to the header files so the BPF scheduler can use it. Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10sched_ext: Implement scx_bpf_now()Changwoo Min
Returns a high-performance monotonically non-decreasing clock for the current CPU. The clock returned is in nanoseconds. It provides the following properties: 1) High performance: Many BPF schedulers call bpf_ktime_get_ns() frequently to account for execution time and track tasks' runtime properties. Unfortunately, in some hardware platforms, bpf_ktime_get_ns() -- which eventually reads a hardware timestamp counter -- is neither performant nor scalable. scx_bpf_now() aims to provide a high-performance clock by using the rq clock in the scheduler core whenever possible. 2) High enough resolution for the BPF scheduler use cases: In most BPF scheduler use cases, the required clock resolution is lower than the most accurate hardware clock (e.g., rdtsc in x86). scx_bpf_now() basically uses the rq clock in the scheduler core whenever it is valid. It considers that the rq clock is valid from the time the rq clock is updated (update_rq_clock) until the rq is unlocked (rq_unpin_lock). 3) Monotonically non-decreasing clock for the same CPU: scx_bpf_now() guarantees the clock never goes backward when comparing them in the same CPU. On the other hand, when comparing clocks in different CPUs, there is no such guarantee -- the clock can go backward. It provides a monotonically *non-decreasing* clock so that it would provide the same clock values in two different scx_bpf_now() calls in the same CPU during the same period of when the rq clock is valid. An rq clock becomes valid when it is updated using update_rq_clock() and invalidated when the rq is unlocked using rq_unpin_lock(). Let's suppose the following timeline in the scheduler core: T1. rq_lock(rq) T2. update_rq_clock(rq) T3. a sched_ext BPF operation T4. rq_unlock(rq) T5. a sched_ext BPF operation T6. rq_lock(rq) T7. update_rq_clock(rq) For [T2, T4), we consider that rq clock is valid (SCX_RQ_CLK_VALID is set), so scx_bpf_now() calls during [T2, T4) (including T3) will return the rq clock updated at T2. For duration [T4, T7), when a BPF scheduler can still call scx_bpf_now() (T5), we consider the rq clock is invalid (SCX_RQ_CLK_VALID is unset at T4). So when calling scx_bpf_now() at T5, we will return a fresh clock value by calling sched_clock_cpu() internally. Also, to prevent getting outdated rq clocks from a previous scx scheduler, invalidate all the rq clocks when unloading a BPF scheduler. One example of calling scx_bpf_now(), when the rq clock is invalid (like T5), is in scx_central [1]. The scx_central scheduler uses a BPF timer for preemptive scheduling. In every msec, the timer callback checks if the currently running tasks exceed their timeslice. At the beginning of the BPF timer callback (central_timerfn in scx_central.bpf.c), scx_central gets the current time. When the BPF timer callback runs, the rq clock could be invalid, the same as T5. In this case, scx_bpf_now() returns a fresh clock value rather than returning the old one (T2). [1] https://github.com/sched-ext/scx/blob/main/scheds/c/scx_central.bpf.c Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10sched_ext: Relocate scx_enabled() related codeChangwoo Min
scx_enabled() will be used in scx_rq_clock_update/invalidate() in the following patch, so relocate the scx_enabled() related code to the proper location. Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10soc/tegra: fuse: Update Tegra234 nvmem keepout listKartik Rajput
Various Nvidia userspace applications and tests access following fuse via Fuse nvmem interface: * odmid * odminfo * boot_security_info * public_key_hash * reserved_odm0 * reserved_odm1 * reserved_odm2 * reserved_odm3 * reserved_odm4 * reserved_odm5 * reserved_odm6 * reserved_odm7 * odm_lock * pk_h1 * pk_h2 * revoke_pk_h0 * revoke_pk_h1 * security_mode * system_fw_field_ratchet0 * system_fw_field_ratchet1 * system_fw_field_ratchet2 * system_fw_field_ratchet3 * optin_enable Update tegra234_fuse_keepouts list to allow reading these fuse from nvmem sysfs interface. Signed-off-by: Kartik Rajput <kkartik@nvidia.com> Link: https://lore.kernel.org/r/20241127061053.16775-1-kkartik@nvidia.com Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-01-10soc/tegra: Fix spelling error in tegra234_lookup_slave_timeout()liujing
Fix spelling error in tegra234_lookup_slave_timeout(). Signed-off-by: liujing <liujing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20241209055148.3749-1-liujing@cmss.chinamobile.com Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-01-10arm64: tegra: Fix Tegra234 PCIe interrupt-mapBrad Griffis
For interrupt-map entries, the DTS specification requires that #address-cells is defined for both the child node and the interrupt parent. For the PCIe interrupt-map entries, the parent node ("gic") has not specified #address-cells. The existing layout of the PCIe interrupt-map entries indicates that it assumes that #address-cells is zero for this node. Explicitly set #address-cells to zero for "gic" so that it complies with the device tree specification. NVIDIA EDK2 works around this issue by assuming #address-cells is zero in this scenario, but that workaround is being removed and so this update is needed or else NVIDIA EDK2 cannot successfully parse the device tree and the board cannot boot. Fixes: ec142c44b026 ("arm64: tegra: Add P2U and PCIe controller nodes to Tegra234 DT") Signed-off-by: Brad Griffis <bgriffis@nvidia.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241213235602.452303-1-bgriffis@nvidia.com Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-01-10perf report: Fix misleading help message about --demangleJiachen Zhang
The wrong help message may mislead users. This commit fixes it. Fixes: 328ccdace8855289 ("perf report: Add --no-demangle option") Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Jiachen Zhang <me@jcix.top> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250109152220.1869581-1-me@jcix.top Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf ftrace: Fix display for range of the first bucketNamhyung Kim
When min_latency is not given, it prints 0 - 0. It should be 0 - 1. Before: $ sudo ./perf ftrace latency -a -T do_futex sleep 1 # DURATION | COUNT | GRAPH | 0 - 0 us | 321 | ########### | ... After: $ sudo ./perf ftrace latency -a -T do_futex sleep 1 # DURATION | COUNT | GRAPH | 0 - 1 us | 699 | ############ | ... Fixes: 08b875b6bf608589 ("perf ftrace latency: Introduce --min-latency to narrow down into a latency range") Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Gabriele Monaco <gmonaco@redhat.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250108210015.1188531-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf ftrace: Check min/max latency only with bucket rangeNamhyung Kim
It's an optional feature and remains 0 when bucket range is not given. And it makes the histogram goes to the last entry always because any latency (num) is greater than or equal to 0. Before: $ sudo ./perf ftrace latency -a -T do_futex sleep 1 # DURATION | COUNT | GRAPH | 0 - 0 us | 0 | | 1 - 2 us | 0 | | 2 - 4 us | 0 | | 4 - 8 us | 0 | | 8 - 16 us | 0 | | 16 - 32 us | 0 | | 32 - 64 us | 0 | | 64 - 128 us | 0 | | 128 - 256 us | 0 | | 256 - 512 us | 0 | | 512 - 1024 us | 0 | | 1 - 2 ms | 0 | | 2 - 4 ms | 0 | | 4 - 8 ms | 0 | | 8 - 16 ms | 0 | | 16 - 32 ms | 0 | | 32 - 64 ms | 0 | | 64 - 128 ms | 0 | | 128 - 256 ms | 0 | | 256 - 512 ms | 0 | | 512 - 1024 ms | 0 | | 1 - ... s | 1353 | ############################################## | After: $ sudo ./perf ftrace latency -a -T do_futex sleep 1 # DURATION | COUNT | GRAPH | 0 - 0 us | 321 | ########### | 1 - 2 us | 132 | #### | 2 - 4 us | 202 | ####### | 4 - 8 us | 188 | ###### | 8 - 16 us | 16 | | 16 - 32 us | 12 | | 32 - 64 us | 30 | # | 64 - 128 us | 98 | ### | 128 - 256 us | 53 | # | 256 - 512 us | 57 | ## | 512 - 1024 us | 9 | | 1 - 2 ms | 9 | | 2 - 4 ms | 1 | | 4 - 8 ms | 98 | ### | 8 - 16 ms | 5 | | 16 - 32 ms | 7 | | 32 - 64 ms | 32 | # | 64 - 128 ms | 10 | | 128 - 256 ms | 10 | | 256 - 512 ms | 2 | | 512 - 1024 ms | 0 | | 1 - ... s | 0 | | Fixes: 690a052a6d85c530 ("perf ftrace latency: Add --max-latency option") Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Gabriele Monaco <gmonaco@redhat.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250108210015.1188531-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10of: Correct child specifier used as input of the 2nd nexus nodeZijun Hu
API of_parse_phandle_with_args_map() will use wrong input for nexus node Nexus_2 as shown below: Node_1 Nexus_1 Nexus_2 &Nexus_1,arg_1 -> arg_1,&Nexus_2,arg_2' -> &Nexus_2,arg_2 -> arg_2,... map-pass-thru=<...> Nexus_1's output arg_2 should be used as input of Nexus_2, but the API wrongly uses arg_2' instead which != arg_2 due to Nexus_1's map-pass-thru. Fix by always making @match_array point to @initial_match_array into which to store nexus output. Fixes: bd6f2fd5a1d5 ("of: Support parsing phandle argument lists through a nexus node") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20250109-of_core_fix-v4-1-db8a72415b8c@quicinc.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-01-10Merge tag 'v6.13-rockchip-dtsfixes1' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes Fixed card-detect on one board and some missing properties added. * tag 'v6.13-rockchip-dtsfixes1' of https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: arm64: dts: rockchip: add hevc power domain clock to rk3328 arm64: dts: rockchip: Fix the SD card detection on NanoPi R6C/R6S arm64: dts: rockchip: rename rfkill label for Radxa ROCK 5B arm64: dts: rockchip: add reset-names for combphy on rk3568 Link: https://lore.kernel.org/r/2914560.yaVYbkx8dN@diego Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-01-10perf: map pages in advanceLorenzo Stoakes
We are adjusting struct page to make it smaller, removing unneeded fields which correctly belong to struct folio. Two of those fields are page->index and page->mapping. Perf is currently making use of both of these. This is unnecessary. This patch eliminates this. Perf establishes its own internally controlled memory-mapped pages using vm_ops hooks. The first page in the mapping is the read/write user control page, and the rest of the mapping consists of read-only pages. The VMA is backed by kernel memory either from the buddy allocator or vmalloc depending on configuration. It is intended to be mapped read/write, but because it has a page_mkwrite() hook, vma_wants_writenotify() indicates that it should be mapped read-only. When a write fault occurs, the provided page_mkwrite() hook, perf_mmap_fault() (doing double duty handing faults as well) uses the vmf->pgoff field to determine if this is the first page, allowing for the desired read/write first page, read-only rest mapping. For this to work the implementation has to carefully work around faulting logic. When a page is write-faulted, the fault() hook is called first, then its page_mkwrite() hook is called (to allow for dirty tracking in file systems). On fault we set the folio's mapping in perf_mmap_fault(), this is because when do_page_mkwrite() is subsequently invoked, it treats a missing mapping as an indicator that the fault should be retried. We also set the folio's index so, given the folio is being treated as faux user memory, it correctly references its offset within the VMA. This explains why the mapping and index fields are used - but it's not necessary. We preallocate pages when perf_mmap() is called for the first time via rb_alloc(), and further allocate auxiliary pages via rb_aux_alloc() as needed if the mapping requires it. This allocation is done in the f_ops->mmap() hook provided in perf_mmap(), and so we can instead simply map all the memory right away here - there's no point in handling (read) page faults when we don't demand page nor need to be notified about them (perf does not). This patch therefore changes this logic to map everything when the mmap() hook is called, establishing a PFN map. It implements vm_ops->pfn_mkwrite() to provide the required read/write vs. read-only behaviour, which does not require the previously implemented workarounds. While it is not ideal to use a VM_PFNMAP here, doing anything else will result in the page_mkwrite() hook need to be provided, which requires the same page->mapping hack this patch seeks to undo. It will also result in the pages being treated as folios and placed on the rmap, which really does not make sense for these mappings. Semantically it makes sense to establish this as some kind of special mapping, as the pages are managed by perf and are not strictly user pages, but currently the only means by which we can do so functionally while maintaining the required R/W and R/O behaviour is a PFN map. There should be no change to actual functionality as a result of this change. Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20250103153151.124163-1-lorenzo.stoakes@oracle.com
2025-01-10perf/x86/intel/uncore: Support more units on Granite RapidsKan Liang
The same CXL PMONs support is also avaiable on GNR. Apply spr_uncore_cxlcm and spr_uncore_cxldp to GNR as well. The other units were broken on early HW samples, so they were ignored in the early enabling patch. The issue has been fixed and verified on the later production HW. Add UPI, B2UPI, B2HOT, PCIEX16 and PCIEX8 for GNR. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Eric Hu <eric.hu@intel.com> Link: https://lkml.kernel.org/r/20250108143017.1793781-2-kan.liang@linux.intel.com