summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-12-10bpf: add find_containing_subprog() utility functionEduard Zingerman
Add a utility function, looking for a subprogram containing a given instruction index, rewrite find_subprog() to use this function. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241210041100.1898468-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-10bpf,perf: Fix invalid prog_array access in perf_event_detach_bpf_progJiri Olsa
Syzbot reported [1] crash that happens for following tracing scenario: - create tracepoint perf event with attr.inherit=1, attach it to the process and set bpf program to it - attached process forks -> chid creates inherited event the new child event shares the parent's bpf program and tp_event (hence prog_array) which is global for tracepoint - exit both process and its child -> release both events - first perf_event_detach_bpf_prog call will release tp_event->prog_array and second perf_event_detach_bpf_prog will crash, because tp_event->prog_array is NULL The fix makes sure the perf_event_detach_bpf_prog checks prog_array is valid before it tries to remove the bpf program from it. [1] https://lore.kernel.org/bpf/Z1MR6dCIKajNS6nU@krava/T/#m91dbf0688221ec7a7fc95e896a7ef9ff93b0b8ad Fixes: 0ee288e69d03 ("bpf,perf: Fix perf_event_detach_bpf_prog error handling") Reported-by: syzbot+2e0d2840414ce817aaac@syzkaller.appspotmail.com Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241208142507.1207698-1-jolsa@kernel.org
2024-12-10bpf: Fix UAF via mismatching bpf_prog/attachment RCU flavorsJann Horn
Uprobes always use bpf_prog_run_array_uprobe() under tasks-trace-RCU protection. But it is possible to attach a non-sleepable BPF program to a uprobe, and non-sleepable BPF programs are freed via normal RCU (see __bpf_prog_put_noref()). This leads to UAF of the bpf_prog because a normal RCU grace period does not imply a tasks-trace-RCU grace period. Fix it by explicitly waiting for a tasks-trace-RCU grace period after removing the attachment of a bpf_prog to a perf_event. Fixes: 8c7dcb84e3b7 ("bpf: implement sleepable uprobes by chaining gps") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20241210-bpf-fix-actual-uprobe-uaf-v1-1-19439849dd44@google.com
2024-12-10ARC: bpf: Correct conditional check in 'check_jmp_32'Hardevsinh Palaniya
The original code checks 'if (ARC_CC_AL)', which is always true since ARC_CC_AL is a constant. This makes the check redundant and likely obscures the intention of verifying whether the jump is conditional. Updates the code to check cond == ARC_CC_AL instead, reflecting the intent to differentiate conditional from unconditional jumps. Suggested-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Acked-by: Shahab Vahedi <list+bpf@vahedi.org> Signed-off-by: Hardevsinh Palaniya <hardevsinh.palaniya@siliconsignals.io> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2024-12-10ARC: dts: Replace deprecated snps,nr-gpios property for ↵Uwe Kleine-König
snps,dw-apb-gpio-port devices snps,dw-apb-gpio-port is deprecated since commit ef42a8da3cf3 ("dt-bindings: gpio: dwapb: Add ngpios property support"). The respective driver supports this since commit 7569486d79ae ("gpio: dwapb: Add ngpios DT-property support") which is included in Linux v5.10-rc1. This change was created using git grep -l snps,nr-gpios arch/arc/boot/dts | xargs perl -p -i -e 's/\bsnps,nr-gpios\b/ngpios/ . Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2024-12-10ARC: build: Use __force to suppress per-CPU cmpxchg warningsPaul E. McKenney
Currently, the cast of the first argument to cmpxchg_emu_u8() drops the __percpu address-space designator, which results in sparse complaints when applying cmpxchg() to per-CPU variables in ARC. Therefore, use __force to suppress these complaints, given that this does not pertain to cmpxchg() semantics, which are plently well-defined on variables in general, whether per-CPU or otherwise. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202409251336.ToC0TvWB-lkp@intel.com/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: <linux-snps-arc@lists.infradead.org> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2024-12-10ARC: fix reference of dependency for PAE40 configLukas Bulwahn
Commit d71e629bed5b ("ARC: build: disallow invalid PAE40 + 4K page config") reworks the build dependencies for ARC_HAS_PAE40, and accidentally refers to the non-existing config option MMU_V4 rather than the intended option ARC_MMU_V4. Note the missing prefix in the name here. Refer to the intended config option in the dependency of the ARC_HAS_PAE40 config. Fixes: d71e629bed5b ("ARC: build: disallow invalid PAE40 + 4K page config") Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2024-12-10ARC: build: disallow invalid PAE40 + 4K page configVineet Gupta
The config option being built was | CONFIG_ARC_MMU_V4=y | CONFIG_ARC_PAGE_SIZE_4K=y | CONFIG_HIGHMEM=y | CONFIG_ARC_HAS_PAE40=y This was hitting a BUILD_BUG_ON() since a 4K page can't hoist 1k, 8-byte PTE entries (8 byte due to PAE40). BUILD_BUG_ON() is a good last ditch resort, but such a config needs to be disallowed explicitly in Kconfig. Side-note: the actual fix is single liner dependency, but while at it cleaned out a few things: - 4K dependency on MMU v3 or v4 is always true, since 288ff7de62af09 ("ARC: retire MMUv1 and MMUv2 support") - PAE40 dependency in on MMU ver not really ISA, although that follows eventually. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202409160223.xydgucbY-lkp@intel.com/ Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2024-12-10arc: rename aux.h to arc_aux.hBenjamin Szőke
The goal is to clean-up Linux repository from AUX file names, because the use of such file names is prohibited on other operating systems such as Windows, so the Linux repository cannot be cloned and edited on them. Reviewed-by: Shahab Vahedi <list+bpf@vahedi.org> Signed-off-by: Benjamin Szőke <egyszeregy@freemail.hu> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2024-12-10of/irq: Fix using uninitialized variable @addr_len in API of_irq_parse_one()Zijun Hu
of_irq_parse_one() may use uninitialized variable @addr_len as shown below: // @addr_len is uninitialized int addr_len; // This operation does not touch @addr_len if it fails. addr = of_get_property(device, "reg", &addr_len); // Use uninitialized @addr_len if the operation fails. if (addr_len > sizeof(addr_buf)) addr_len = sizeof(addr_buf); // Check the operation result here. if (addr) memcpy(addr_buf, addr, addr_len); Fix by initializing @addr_len before the operation. Fixes: b739dffa5d57 ("of/irq: Prevent device address out-of-bounds read in interrupt map walk") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241209-of_irq_fix-v1-4-782f1419c8a1@quicinc.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-12-10of/irq: Fix interrupt-map cell length check in of_irq_parse_imap_parent()Zijun Hu
On a malformed interrupt-map property which is shorter than expected by 1 cell, we may read bogus data past the end of the property instead of returning an error in of_irq_parse_imap_parent(). Decrement the remaining length when skipping over the interrupt parent phandle cell. Fixes: 935df1bd40d4 ("of/irq: Factor out parsing of interrupt-map parent phandle+args from of_irq_parse_raw()") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241209-of_irq_fix-v1-1-782f1419c8a1@quicinc.com [rh: reword commit msg] Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-12-10of: Fix refcount leakage for OF node returned by __of_get_dma_parent()Zijun Hu
__of_get_dma_parent() returns OF device node @args.np, but the node's refcount is increased twice, by both of_parse_phandle_with_args() and of_node_get(), so causes refcount leakage for the node. Fix by directly returning the node got by of_parse_phandle_with_args(). Fixes: f83a6e5dea6c ("of: address: Add support for the parent DMA bus") Cc: stable@vger.kernel.org Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20241206-of_core_fix-v1-4-dc28ed56bec3@quicinc.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-12-10selftests/bpf: Extend test for sockmap update with sameMichal Luczaj
Verify that the sockmap link was not severed, and socket's entry is indeed removed from the map when the corresponding descriptor gets closed. Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241202-sockmap-replace-v1-2-1e88579e7bd5@rbox.co
2024-12-10bpf, sockmap: Fix race between element replace and close()Michal Luczaj
Element replace (with a socket different from the one stored) may race with socket's close() link popping & unlinking. __sock_map_delete() unconditionally unrefs the (wrong) element: // set map[0] = s0 map_update_elem(map, 0, s0) // drop fd of s0 close(s0) sock_map_close() lock_sock(sk) (s0!) sock_map_remove_links(sk) link = sk_psock_link_pop() sock_map_unlink(sk, link) sock_map_delete_from_link // replace map[0] with s1 map_update_elem(map, 0, s1) sock_map_update_elem (s1!) lock_sock(sk) sock_map_update_common psock = sk_psock(sk) spin_lock(&stab->lock) osk = stab->sks[idx] sock_map_add_link(..., &stab->sks[idx]) sock_map_unref(osk, &stab->sks[idx]) psock = sk_psock(osk) sk_psock_put(sk, psock) if (refcount_dec_and_test(&psock)) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) unlock_sock(sk) __sock_map_delete spin_lock(&stab->lock) sk = *psk // s1 replaced s0; sk == s1 if (!sk_test || sk_test == sk) // sk_test (s0) != sk (s1); no branch sk = xchg(psk, NULL) if (sk) sock_map_unref(sk, psk) // unref s1; sks[idx] will dangle psock = sk_psock(sk) sk_psock_put(sk, psock) if (refcount_dec_and_test()) sk_psock_drop(sk, psock) spin_unlock(&stab->lock) release_sock(sk) Then close(map) enqueues bpf_map_free_deferred, which finally calls sock_map_free(). This results in some refcount_t warnings along with a KASAN splat [1]. Fix __sock_map_delete(), do not allow sock_map_unref() on elements that may have been replaced. [1]: BUG: KASAN: slab-use-after-free in sock_map_free+0x10e/0x330 Write of size 4 at addr ffff88811f5b9100 by task kworker/u64:12/1063 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Not tainted 6.12.0+ #125 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred Call Trace: <TASK> dump_stack_lvl+0x68/0x90 print_report+0x174/0x4f6 kasan_report+0xb9/0x190 kasan_check_range+0x10f/0x1e0 sock_map_free+0x10e/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 1202: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 __kasan_slab_alloc+0x85/0x90 kmem_cache_alloc_noprof+0x131/0x450 sk_prot_alloc+0x5b/0x220 sk_alloc+0x2c/0x870 unix_create1+0x88/0x8a0 unix_create+0xc5/0x180 __sock_create+0x241/0x650 __sys_socketpair+0x1ce/0x420 __x64_sys_socketpair+0x92/0x100 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 46: kasan_save_stack+0x1e/0x40 kasan_save_track+0x10/0x30 kasan_save_free_info+0x37/0x60 __kasan_slab_free+0x4b/0x70 kmem_cache_free+0x1a1/0x590 __sk_destruct+0x388/0x5a0 sk_psock_destroy+0x73e/0xa50 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 The buggy address belongs to the object at ffff88811f5b9080 which belongs to the cache UNIX-STREAM of size 1984 The buggy address is located 128 bytes inside of freed 1984-byte region [ffff88811f5b9080, ffff88811f5b9840) The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x11f5b8 head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 memcg:ffff888127d49401 flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) page_type: f5(slab) raw: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 raw: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000040 ffff8881042e4500 dead000000000122 0000000000000000 head: 0000000000000000 00000000800f000f 00000001f5000000 ffff888127d49401 head: 0017ffffc0000003 ffffea00047d6e01 ffffffffffffffff 0000000000000000 head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88811f5b9000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88811f5b9080: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88811f5b9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88811f5b9200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Disabling lock debugging due to kernel taint refcount_t: addition on 0; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:25 refcount_warn_saturate+0xce/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B 6.12.0+ #125 Tainted: [B]=BAD_PAGE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xce/0x150 Code: 34 73 eb 03 01 e8 82 53 ad fe 0f 0b eb b1 80 3d 27 73 eb 03 00 75 a8 48 c7 c7 80 bd 95 84 c6 05 17 73 eb 03 01 e8 62 53 ad fe <0f> 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000002 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xce/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xce/0x150 sock_map_free+0x2e5/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 refcount_t: underflow; use-after-free. WARNING: CPU: 14 PID: 1063 at lib/refcount.c:28 refcount_warn_saturate+0xee/0x150 CPU: 14 UID: 0 PID: 1063 Comm: kworker/u64:12 Tainted: G B W 6.12.0+ #125 Tainted: [B]=BAD_PAGE, [W]=WARN Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Workqueue: events_unbound bpf_map_free_deferred RIP: 0010:refcount_warn_saturate+0xee/0x150 Code: 17 73 eb 03 01 e8 62 53 ad fe 0f 0b eb 91 80 3d 06 73 eb 03 00 75 88 48 c7 c7 e0 bd 95 84 c6 05 f6 72 eb 03 01 e8 42 53 ad fe <0f> 0b e9 6e ff ff ff 80 3d e6 72 eb 03 00 0f 85 61 ff ff ff 48 c7 RSP: 0018:ffff88815c49fc70 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff88811f5b9100 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: 0000000000000003 R08: 0000000000000001 R09: ffffed10bcde6349 R10: ffff8885e6f31a4b R11: 0000000000000000 R12: ffff88813be0b000 R13: ffff88811f5b9100 R14: ffff88811f5b9080 R15: ffff88813be0b024 FS: 0000000000000000(0000) GS:ffff8885e6f00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055dda99b0250 CR3: 000000015dbac000 CR4: 0000000000752ef0 PKRU: 55555554 Call Trace: <TASK> ? __warn.cold+0x5f/0x1ff ? refcount_warn_saturate+0xee/0x150 ? report_bug+0x1ec/0x390 ? handle_bug+0x58/0x90 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0xee/0x150 sock_map_free+0x2d3/0x330 bpf_map_free_deferred+0x173/0x320 process_one_work+0x846/0x1420 worker_thread+0x5b3/0xf80 kthread+0x29e/0x360 ret_from_fork+0x2d/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> irq event stamp: 10741 hardirqs last enabled at (10741): [<ffffffff84400ec6>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (10740): [<ffffffff811e532d>] handle_softirqs+0x60d/0x770 softirqs last enabled at (10506): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 softirqs last disabled at (10301): [<ffffffff811e55a9>] __irq_exit_rcu+0x109/0x210 Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241202-sockmap-replace-v1-3-1e88579e7bd5@rbox.co
2024-12-10bpf, sockmap: Fix update element with sameMichal Luczaj
Consider a sockmap entry being updated with the same socket: osk = stab->sks[idx]; sock_map_add_link(psock, link, map, &stab->sks[idx]); stab->sks[idx] = sk; if (osk) sock_map_unref(osk, &stab->sks[idx]); Due to sock_map_unref(), which invokes sock_map_del_link(), all the psock's links for stab->sks[idx] are torn: list_for_each_entry_safe(link, tmp, &psock->link, list) { if (link->link_raw == link_raw) { ... list_del(&link->list); sk_psock_free_link(link); } } And that includes the new link sock_map_add_link() added just before the unref. This results in a sockmap holding a socket, but without the respective link. This in turn means that close(sock) won't trigger the cleanup, i.e. a closed socket will not be automatically removed from the sockmap. Stop tearing the links when a matching link_raw is found. Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20241202-sockmap-replace-v1-1-1e88579e7bd5@rbox.co
2024-12-10cpufreq/amd-pstate: Use boost numerator for upper bound of frequenciesMario Limonciello
commit 18d9b5227121 ("cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled") introduced different semantics for min/max limits based upon whether the user turned off boost from sysfs. This however is not necessary when the highest perf value is the boost numerator. Suggested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Fixes: 18d9b5227121 ("cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled") Link: https://lore.kernel.org/r/20241209185248.16301-3-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-10cpufreq/amd-pstate: Store the boost numerator as highest perf againMario Limonciello
commit ad4caad58d91d ("cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()") changed the semantics for highest perf and commit 18d9b52271213 ("cpufreq/amd-pstate: Use nominal perf for limits when boost is disabled") worked around those semantic changes. This however is a confusing result and furthermore makes it awkward to change frequency limits and boost due to the scaling differences. Restore the boost numerator to highest perf again. Suggested-by: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Fixes: ad4caad58d91 ("cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()") Link: https://lore.kernel.org/r/20241209185248.16301-2-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
2024-12-10blk-cgroup: Fix UAF in blkcg_unpin_online()Tejun Heo
blkcg_unpin_online() walks up the blkcg hierarchy putting the online pin. To walk up, it uses blkcg_parent(blkcg) but it was calling that after blkcg_destroy_blkgs(blkcg) which could free the blkcg, leading to the following UAF: ================================================================== BUG: KASAN: slab-use-after-free in blkcg_unpin_online+0x15a/0x270 Read of size 8 at addr ffff8881057678c0 by task kworker/9:1/117 CPU: 9 UID: 0 PID: 117 Comm: kworker/9:1 Not tainted 6.13.0-rc1-work-00182-gb8f52214c61a-dirty #48 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 02/02/2022 Workqueue: cgwb_release cgwb_release_workfn Call Trace: <TASK> dump_stack_lvl+0x27/0x80 print_report+0x151/0x710 kasan_report+0xc0/0x100 blkcg_unpin_online+0x15a/0x270 cgwb_release_workfn+0x194/0x480 process_scheduled_works+0x71b/0xe20 worker_thread+0x82a/0xbd0 kthread+0x242/0x2c0 ret_from_fork+0x33/0x70 ret_from_fork_asm+0x1a/0x30 </TASK> ... Freed by task 1944: kasan_save_track+0x2b/0x70 kasan_save_free_info+0x3c/0x50 __kasan_slab_free+0x33/0x50 kfree+0x10c/0x330 css_free_rwork_fn+0xe6/0xb30 process_scheduled_works+0x71b/0xe20 worker_thread+0x82a/0xbd0 kthread+0x242/0x2c0 ret_from_fork+0x33/0x70 ret_from_fork_asm+0x1a/0x30 Note that the UAF is not easy to trigger as the free path is indirected behind a couple RCU grace periods and a work item execution. I could only trigger it with artifical msleep() injected in blkcg_unpin_online(). Fix it by reading the parent pointer before destroying the blkcg's blkg's. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Abagail ren <renzezhongucas@gmail.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Fixes: 4308a434e5e0 ("blkcg: don't offline parent blkcg first") Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-10MAINTAINERS: update Coly Li's email addressColy Li
This patch updates Coly Li's email addres to colyli@kernel.org. Signed-off-by: Coly Li <colyli@kernel.org> Link: https://lore.kernel.org/r/20241208115350.85103-1-colyli@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-10block: Prevent potential deadlocks in zone write plug error recoveryDamien Le Moal
Zone write plugging for handling writes to zones of a zoned block device always execute a zone report whenever a write BIO to a zone fails. The intent of this is to ensure that the tracking of a zone write pointer is always correct to ensure that the alignment to a zone write pointer of write BIOs can be checked on submission and that we can always correctly emulate zone append operations using regular write BIOs. However, this error recovery scheme introduces a potential deadlock if a device queue freeze is initiated while BIOs are still plugged in a zone write plug and one of these write operation fails. In such case, the disk zone write plug error recovery work is scheduled and executes a report zone. This in turn can result in a request allocation in the underlying driver to issue the report zones command to the device. But with the device queue freeze already started, this allocation will block, preventing the report zone execution and the continuation of the processing of the plugged BIOs. As plugged BIOs hold a queue usage reference, the queue freeze itself will never complete, resulting in a deadlock. Avoid this problem by completely removing from the zone write plugging code the use of report zones operations after a failed write operation, instead relying on the device user to either execute a report zones, reset the zone, finish the zone, or give up writing to the device (which is a fairly common pattern for file systems which degrade to read-only after write failures). This is not an unreasonnable requirement as all well-behaved applications, FSes and device mapper already use report zones to recover from write errors whenever possible by comparing the current position of a zone write pointer with what their assumption about the position is. The changes to remove the automatic error recovery are as follows: - Completely remove the error recovery work and its associated resources (zone write plug list head, disk error list, and disk zone_wplugs_work work struct). This also removes the functions disk_zone_wplug_set_error() and disk_zone_wplug_clear_error(). - Change the BLK_ZONE_WPLUG_ERROR zone write plug flag into BLK_ZONE_WPLUG_NEED_WP_UPDATE. This new flag is set for a zone write plug whenever a write opration targetting the zone of the zone write plug fails. This flag indicates that the zone write pointer offset is not reliable and that it must be updated when the next report zone, reset zone, finish zone or disk revalidation is executed. - Modify blk_zone_write_plug_bio_endio() to set the BLK_ZONE_WPLUG_NEED_WP_UPDATE flag for the target zone of a failed write BIO. - Modify the function disk_zone_wplug_set_wp_offset() to clear this new flag, thus implementing recovery of a correct write pointer offset with the reset (all) zone and finish zone operations. - Modify blkdev_report_zones() to always use the disk_report_zones_cb() callback so that disk_zone_wplug_sync_wp_offset() can be called for any zone marked with the BLK_ZONE_WPLUG_NEED_WP_UPDATE flag. This implements recovery of a correct write pointer offset for zone write plugs marked with BLK_ZONE_WPLUG_NEED_WP_UPDATE and within the range of the report zones operation executed by the user. - Modify blk_revalidate_seq_zone() to call disk_zone_wplug_sync_wp_offset() for all sequential write required zones when a zoned block device is revalidated, thus always resolving any inconsistency between the write pointer offset of zone write plugs and the actual write pointer position of sequential zones. Fixes: dd291d77cc90 ("block: Introduce zone write plugging") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20241209122357.47838-5-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-10dm: Fix dm-zoned-reclaim zone write pointer alignmentDamien Le Moal
The zone reclaim processing of the dm-zoned device mapper uses blkdev_issue_zeroout() to align the write pointer of a zone being used for reclaiming another zone, to write the valid data blocks from the zone being reclaimed at the same position relative to the zone start in the reclaim target zone. The first call to blkdev_issue_zeroout() will try to use hardware offload using a REQ_OP_WRITE_ZEROES operation if the device reports a non-zero max_write_zeroes_sectors queue limit. If this operation fails because of the lack of hardware support, blkdev_issue_zeroout() falls back to using a regular write operation with the zero-page as buffer. Currently, such REQ_OP_WRITE_ZEROES failure is automatically handled by the block layer zone write plugging code which will execute a report zones operation to ensure that the write pointer of the target zone of the failed operation has not changed and to "rewind" the zone write pointer offset of the target zone as it was advanced when the write zero operation was submitted. So the REQ_OP_WRITE_ZEROES failure does not cause any issue and blkdev_issue_zeroout() works as expected. However, since the automatic recovery of zone write pointers by the zone write plugging code can potentially cause deadlocks with queue freeze operations, a different recovery must be implemented in preparation for the removal of zone write plugging report zones based recovery. Do this by introducing the new function blk_zone_issue_zeroout(). This function first calls blkdev_issue_zeroout() with the flag BLKDEV_ZERO_NOFALLBACK to intercept failures on the first execution which attempt to use the device hardware offload with the REQ_OP_WRITE_ZEROES operation. If this attempt fails, a report zone operation is issued to restore the zone write pointer offset of the target zone to the correct position and blkdev_issue_zeroout() is called again without the BLKDEV_ZERO_NOFALLBACK flag. The report zones operation performing this recovery is implemented using the helper function disk_zone_sync_wp_offset() which calls the gendisk report_zones file operation with the callback disk_report_zones_cb(). This callback updates the target write pointer offset of the target zone using the new function disk_zone_wplug_sync_wp_offset(). dmz_reclaim_align_wp() is modified to change its call to blkdev_issue_zeroout() to a call to blk_zone_issue_zeroout() without any other change needed as the two functions are functionnally equivalent. Fixes: dd291d77cc90 ("block: Introduce zone write plugging") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20241209122357.47838-4-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-10block: Ignore REQ_NOWAIT for zone reset and zone finish operationsDamien Le Moal
There are currently any issuer of REQ_OP_ZONE_RESET and REQ_OP_ZONE_FINISH operations that set REQ_NOWAIT. However, as we cannot handle this flag correctly due to the potential request allocation failure that may happen in blk_mq_submit_bio() after blk_zone_plug_bio() has handled the zone write plug write pointer updates for the targeted zones, modify blk_zone_wplug_handle_reset_or_finish() to warn if this flag is set and ignore it. Fixes: dd291d77cc90 ("block: Introduce zone write plugging") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20241209122357.47838-3-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-10block: Use a zone write plug BIO work for REQ_NOWAIT BIOsDamien Le Moal
For zoned block devices, a write BIO issued to a zone that has no on-going writes will be prepared for execution and allowed to execute immediately by blk_zone_wplug_handle_write() (called from blk_zone_plug_bio()). However, if this BIO specifies REQ_NOWAIT, the allocation of a request for its execution in blk_mq_submit_bio() may fail after blk_zone_plug_bio() completed, marking the target zone of the BIO as plugged. When this BIO is retried later on, it will be blocked as the zone write plug of the target zone is in a plugged state without any on-going write operation (completion of write operations trigger unplugging of the next write BIOs for a zone). This leads to a BIO that is stuck in a zone write plug and never completes, which results in various issues such as hung tasks. Avoid this problem by always executing REQ_NOWAIT write BIOs using the BIO work of a zone write plug. This ensure that we never block the BIO issuer and can thus safely ignore the REQ_NOWAIT flag when executing the BIO from the zone write plug BIO work. Since such BIO may be the first write BIO issued to a zone with no on-going write, modify disk_zone_wplug_add_bio() to schedule the zone write plug BIO work if the write plug is not already marked with the BLK_ZONE_WPLUG_PLUGGED flag. This scheduling is otherwise not necessary as the completion of the on-going write for the zone will schedule the execution of the next plugged BIOs. blk_zone_wplug_handle_write() is also fixed to better handle zone write plug allocation failures for REQ_NOWAIT BIOs by failing a write BIO using bio_wouldblock_error() instead of bio_io_error(). Reported-by: Bart Van Assche <bvanassche@acm.org> Fixes: dd291d77cc90 ("block: Introduce zone write plugging") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20241209122357.47838-2-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-10drm/amdkfd: Uninitialized pointer readAndrew Martin
This a pointer that is being passed into other functions, so it is best to initialize it to NULL prior. Signed-off-by: Andrew Martin <Andrew.Martin@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Remove unused dcn_find_dcfclk_suits_allDr. David Alan Gilbert
dcn_find_dcfclk_suits_all() last use was removed by 2018's commit 4fd994c448a3 ("drm/amd/display: Start using the new pp_smu interface") Remove it, and the dcn_find_normalized_clock_vdd_Level helper it used. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Remove unused mmhubbub_warmup fieldDr. David Alan Gilbert
mmhubbub_warmup is a field that was only read by the just removed dc_stream_warmup_writeback() function. Remove the field and it's initialisers. It was only ever initialised to a single function value (dcn30_mmhubbub_warmup) which is called explicitly elsewhere. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Remove unused dc_stream_warmup_writebackDr. David Alan Gilbert
dc_stream_warmup_writeback() is unused since it was added in 2019 by commit 6a652f6d127d ("drm/amd/display: Add warmup escape call support") Remove it. Note there is a dcn30 version that's called directly which is kept. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Remove unused dwb3_set_host_read_rate_controlDr. David Alan Gilbert
dwb3_set_host_read_rate_control() has been unused since it was added by commit 8993dee0de2a ("drm/amd/display: Add DCN3 DWB") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Remove unused enable_surface_flip_reportingDr. David Alan Gilbert
enable_surface_flip_reporting() has been unused since it was added by commit 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: rename register headers to dcn_2_0_1Leo Li
They were named with the incorrect dcn version. Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Sun peng Li <sunpeng.li@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: 3.2.313Aric Cyr
* Fix some regressions related to IPS2 and PSR Panel Replay * Bug fixes in DML * DMCUB debug improvements * Other refactors and improvements across multiple components Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: [FW Promotion] Release 0.0.246.0Taimur Hassan
Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: update dcn351 used clock offsetCharlene Liu
[why] hw register offset delta Reviewed-by: Martin Leung <martin.leung@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: remove clearance code of force_ffu_mode flag in ↵Zhongwei
dmub_psr_copy_settings() [Why/How] The force_ffu_mode flag could be initialized at other place. Reviewed-by: Robin Chen <robin.chen@amd.com> Signed-off-by: Zhongwei <Zhongwei.Zhang@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10Revert "drm/amd/display: Don't allow IPS2 in D0 for RCG Dynamic"Nicholas Kazlauskas
This reverts commit 8488646966fe. In some test environments causes reporting failures for S0i3/S4. It shouldn't actually block entry provided there's no race with the last state being updated, but currently suspecting there's an IPS2 check that's no longer being met. Fixes: 8488646966fe ("drm/amd/display: Don't allow IPS2 in D0 for RCG Dynamic") Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10Revert "drm/amd/display: Revised for Replay Pseudo vblank"Gabe Teeger
This reverts commit 0f5ac8c8e275 Due to a replay regression. Fixes: 0f5ac8c8e275 ("drm/amd/display: Revised for Replay Pseudo vblank control") Reviewed-by: Dennis Chan <dennis.chan@amd.com> Signed-off-by: Gabe Teeger <Gabe.Teeger@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Update color space, bias and scale programming sequenceChris Park
[Why] DMColor inaccurately updates color space, bias and scale destructively in dc_plane_state. This can be resolved by accurately populating the infos on dc_plane_info where then translation to plane state can happen as a whole surface update sequence. [How] Remove dc_plane_state update in DMColor and update color space, bias and scale on dc_plane_info. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Chris Park <chris.park@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Use resource_build_scaling_params for dcn20Peterson
[WHY] When using upscaling on certain gpus, some incorrect scaling calculations would be made causing hangs. [HOW] This was fixed by using the resource_build_scaling_params function on these gpus. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Peterson Guo <peterson.guo@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Overwriting dualDPP UBF values before usageAusef Yousof
[WHY] Right now in dml2 mode validation we are calculating UBF parameters for prefetch calculation for single and dual DPP scenarios. Data structure to store such values are just 1D arrays, the single DPP values are overwritten by the dualDPP values, and we end up using dualDPP for prefetch calculations twice (once in place of singleDPP support check and again for dual). This naturally leads to many problems, one of which validating a mode in "singleDPP" (when we used dual DPP parameters) and sending the singleDPP parameters to mode programming, if we cannot support then we observe the corruption as described in the ticket. [HOW] UBF values need to have 2d arrays to store values specific to single and dual DPP states to avoid single DPP values being overwritten. Other parameters are recorded on a per state basis such as prefetch UBF values but they are in the same loop used for calculation and at that point its fine to overwrite them, its not the case for plain UBF values. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Make DMCUB tracebuffer debugfs chronologicalLeo Li
[Why] Previously, the debugfs did a simple dump of the tracebuffer region. Because the tracebuffer is a ring, it meant that the entries printed may not be in chronological order if the ring rolled over. This makes parsing the tracelog cumbersome. [How] Since dmcub provides the current entry count, use that to determine the latest tracelog entry and output the log chronologically. Also, the fb region size is not accurate of the actual tracebuffer size; it has been padded to alignment requirements. Use the tracebuffer size reported by the fw meta_info, if available. If not, a fallback to the hardcoded default is needed. To make this value available to other .c files, its define was moved to dmub_srv.h. Also, print a indicator at the start of the log if rollover occurred. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: clean up SPL codeSamson Tam
[Why & How] Add check for invalid pixel format, remove unused pixel formats and clean up some names Reviewed-by: Navid Assadian <navid.assadian@amd.com> Signed-off-by: Samson Tam <Samson.Tam@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: DML21 Update Prefetch CalculationsAustin Zheng
[Why/How] Mismatch between mode support and mode programming occurs. Mode support would calculate higher row vblank than mode programming. As a result, mode programming fails and hardware isn't properly programmed. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Austin Zheng <Austin.Zheng@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Adjust secure_display_context data structureWayne Lin
[Why] Variables relates to secure display are spreading out within struct amdgpu_display_manager. [How] Encapsulate relevant variables into struct secure_display_context and adjust relevant affected codes. Reviewed-by: HaoPing Liu <haoping.liu@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Fix phy id mapping issue for secure displayWayne Lin
[Why] Under mst scenario, mst streams are from the same link_enc_hw_inst. As the result, can't utilize that as the phy index for distinguising different stream sinks. [How] Sort the connectors by: link_enc_hw_instance->mst tree depth->mst RAD After sorting the phy index assignment, store connector's relevant info into dm mapping array. Once need the index, just look up the static array. Reviewed-by: HaoPing Liu <haoping.liu@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Adjust dc_stream_forward_crc_window to accept assignment of ↵Wayne Lin
phy_id [Why] For mst streams under same topology, stream->link->link_enc_hw_inst are the same and hence can't distinguish the crc window setting. [How] Firstly adjust dc_stream_forward_crc_window to accept assignment of phy_id. Follow up another patch to determine the phy_id at dm layer. Reviewed-by: HaoPing Liu <haoping.liu@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: Refactor dcn31_panel_construct to avoid assertJoshua Aberback
[Why] We want to avoid unnecessary asserts, one of which is hit in dcn31_panel_construct when booting on a DCN32 asic that has an eDP connector on a different DIG than A or B. The DIG-based mapping only applies when edp0_on_dp1 is supported, therefore the check for valid eng_id can be moved within the appropriate section of the if statement. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Joshua Aberback <joshua.aberback@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: expose DCN401 HUBP functionsKarthi Kandasamy
[Why] Expose DCN401 HUBP functions for use across other platforms. [Description] This change aims to make the DCN401 HUBP functions accessible for enabling their use in future platform developments. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Karthi Kandasamy <karthi.kandasamy@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/display: populate VABC support in DMCUBHarry VanZyllDeJong
[HOW&WHY] Stores DMUB support for enablement of Varibright over VABC in DCN32 Reviewed-by: Aric Cyr <aric.cyr@amd.com> Reviewed-by: Iswara Nagulendran <iswara.nagulendran@amd.com> Signed-off-by: Harry VanZyllDeJong <hvanzyll@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: Support nbif v6_3_1 fatal error handlingCandice Li
Add nbif v6_3_1 fatal error handling support. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd/pm: Revert state if force level failsLijo Lazar
Before forcing level, CG/PG is disabled or enabled depending on the new level. However if the force level operation fails, CG/PG state remains modified. Revert the state change on failure. Also, move invalid operation checks to the beginning before any logic that could change SOC state. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>