summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-01-10sched_ext: Use time helpers in BPF schedulersChangwoo Min
Modify the BPF schedulers to use time helpers defined in common.bpf.h Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10sched_ext: Replace bpf_ktime_get_ns() to scx_bpf_now()Changwoo Min
In the BPF schedulers that use bpf_ktime_get_ns() -- scx_central and scx_flatcg, replace bpf_ktime_get_ns() calls to scx_bpf_now(). Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10sched_ext: Add time helpers for BPF schedulersChangwoo Min
The following functions are added for BPF schedulers: - time_delta(after, before) - time_after(a, b) - time_before(a, b) - time_after_eq(a, b) - time_before_eq(a, b) - time_in_range(a, b, c) - time_in_range_open(a, b, c) Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10sched_ext: Add scx_bpf_now() for BPF schedulerChangwoo Min
scx_bpf_now() is added to the header files so the BPF scheduler can use it. Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10sched_ext: Implement scx_bpf_now()Changwoo Min
Returns a high-performance monotonically non-decreasing clock for the current CPU. The clock returned is in nanoseconds. It provides the following properties: 1) High performance: Many BPF schedulers call bpf_ktime_get_ns() frequently to account for execution time and track tasks' runtime properties. Unfortunately, in some hardware platforms, bpf_ktime_get_ns() -- which eventually reads a hardware timestamp counter -- is neither performant nor scalable. scx_bpf_now() aims to provide a high-performance clock by using the rq clock in the scheduler core whenever possible. 2) High enough resolution for the BPF scheduler use cases: In most BPF scheduler use cases, the required clock resolution is lower than the most accurate hardware clock (e.g., rdtsc in x86). scx_bpf_now() basically uses the rq clock in the scheduler core whenever it is valid. It considers that the rq clock is valid from the time the rq clock is updated (update_rq_clock) until the rq is unlocked (rq_unpin_lock). 3) Monotonically non-decreasing clock for the same CPU: scx_bpf_now() guarantees the clock never goes backward when comparing them in the same CPU. On the other hand, when comparing clocks in different CPUs, there is no such guarantee -- the clock can go backward. It provides a monotonically *non-decreasing* clock so that it would provide the same clock values in two different scx_bpf_now() calls in the same CPU during the same period of when the rq clock is valid. An rq clock becomes valid when it is updated using update_rq_clock() and invalidated when the rq is unlocked using rq_unpin_lock(). Let's suppose the following timeline in the scheduler core: T1. rq_lock(rq) T2. update_rq_clock(rq) T3. a sched_ext BPF operation T4. rq_unlock(rq) T5. a sched_ext BPF operation T6. rq_lock(rq) T7. update_rq_clock(rq) For [T2, T4), we consider that rq clock is valid (SCX_RQ_CLK_VALID is set), so scx_bpf_now() calls during [T2, T4) (including T3) will return the rq clock updated at T2. For duration [T4, T7), when a BPF scheduler can still call scx_bpf_now() (T5), we consider the rq clock is invalid (SCX_RQ_CLK_VALID is unset at T4). So when calling scx_bpf_now() at T5, we will return a fresh clock value by calling sched_clock_cpu() internally. Also, to prevent getting outdated rq clocks from a previous scx scheduler, invalidate all the rq clocks when unloading a BPF scheduler. One example of calling scx_bpf_now(), when the rq clock is invalid (like T5), is in scx_central [1]. The scx_central scheduler uses a BPF timer for preemptive scheduling. In every msec, the timer callback checks if the currently running tasks exceed their timeslice. At the beginning of the BPF timer callback (central_timerfn in scx_central.bpf.c), scx_central gets the current time. When the BPF timer callback runs, the rq clock could be invalid, the same as T5. In this case, scx_bpf_now() returns a fresh clock value rather than returning the old one (T2). [1] https://github.com/sched-ext/scx/blob/main/scheds/c/scx_central.bpf.c Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10sched_ext: Relocate scx_enabled() related codeChangwoo Min
scx_enabled() will be used in scx_rq_clock_update/invalidate() in the following patch, so relocate the scx_enabled() related code to the proper location. Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-10soc/tegra: fuse: Update Tegra234 nvmem keepout listKartik Rajput
Various Nvidia userspace applications and tests access following fuse via Fuse nvmem interface: * odmid * odminfo * boot_security_info * public_key_hash * reserved_odm0 * reserved_odm1 * reserved_odm2 * reserved_odm3 * reserved_odm4 * reserved_odm5 * reserved_odm6 * reserved_odm7 * odm_lock * pk_h1 * pk_h2 * revoke_pk_h0 * revoke_pk_h1 * security_mode * system_fw_field_ratchet0 * system_fw_field_ratchet1 * system_fw_field_ratchet2 * system_fw_field_ratchet3 * optin_enable Update tegra234_fuse_keepouts list to allow reading these fuse from nvmem sysfs interface. Signed-off-by: Kartik Rajput <kkartik@nvidia.com> Link: https://lore.kernel.org/r/20241127061053.16775-1-kkartik@nvidia.com Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-01-10soc/tegra: Fix spelling error in tegra234_lookup_slave_timeout()liujing
Fix spelling error in tegra234_lookup_slave_timeout(). Signed-off-by: liujing <liujing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20241209055148.3749-1-liujing@cmss.chinamobile.com Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-01-10arm64: tegra: Fix Tegra234 PCIe interrupt-mapBrad Griffis
For interrupt-map entries, the DTS specification requires that #address-cells is defined for both the child node and the interrupt parent. For the PCIe interrupt-map entries, the parent node ("gic") has not specified #address-cells. The existing layout of the PCIe interrupt-map entries indicates that it assumes that #address-cells is zero for this node. Explicitly set #address-cells to zero for "gic" so that it complies with the device tree specification. NVIDIA EDK2 works around this issue by assuming #address-cells is zero in this scenario, but that workaround is being removed and so this update is needed or else NVIDIA EDK2 cannot successfully parse the device tree and the board cannot boot. Fixes: ec142c44b026 ("arm64: tegra: Add P2U and PCIe controller nodes to Tegra234 DT") Signed-off-by: Brad Griffis <bgriffis@nvidia.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241213235602.452303-1-bgriffis@nvidia.com Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-01-10perf report: Fix misleading help message about --demangleJiachen Zhang
The wrong help message may mislead users. This commit fixes it. Fixes: 328ccdace8855289 ("perf report: Add --no-demangle option") Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Jiachen Zhang <me@jcix.top> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250109152220.1869581-1-me@jcix.top Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf ftrace: Fix display for range of the first bucketNamhyung Kim
When min_latency is not given, it prints 0 - 0. It should be 0 - 1. Before: $ sudo ./perf ftrace latency -a -T do_futex sleep 1 # DURATION | COUNT | GRAPH | 0 - 0 us | 321 | ########### | ... After: $ sudo ./perf ftrace latency -a -T do_futex sleep 1 # DURATION | COUNT | GRAPH | 0 - 1 us | 699 | ############ | ... Fixes: 08b875b6bf608589 ("perf ftrace latency: Introduce --min-latency to narrow down into a latency range") Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Gabriele Monaco <gmonaco@redhat.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250108210015.1188531-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10perf ftrace: Check min/max latency only with bucket rangeNamhyung Kim
It's an optional feature and remains 0 when bucket range is not given. And it makes the histogram goes to the last entry always because any latency (num) is greater than or equal to 0. Before: $ sudo ./perf ftrace latency -a -T do_futex sleep 1 # DURATION | COUNT | GRAPH | 0 - 0 us | 0 | | 1 - 2 us | 0 | | 2 - 4 us | 0 | | 4 - 8 us | 0 | | 8 - 16 us | 0 | | 16 - 32 us | 0 | | 32 - 64 us | 0 | | 64 - 128 us | 0 | | 128 - 256 us | 0 | | 256 - 512 us | 0 | | 512 - 1024 us | 0 | | 1 - 2 ms | 0 | | 2 - 4 ms | 0 | | 4 - 8 ms | 0 | | 8 - 16 ms | 0 | | 16 - 32 ms | 0 | | 32 - 64 ms | 0 | | 64 - 128 ms | 0 | | 128 - 256 ms | 0 | | 256 - 512 ms | 0 | | 512 - 1024 ms | 0 | | 1 - ... s | 1353 | ############################################## | After: $ sudo ./perf ftrace latency -a -T do_futex sleep 1 # DURATION | COUNT | GRAPH | 0 - 0 us | 321 | ########### | 1 - 2 us | 132 | #### | 2 - 4 us | 202 | ####### | 4 - 8 us | 188 | ###### | 8 - 16 us | 16 | | 16 - 32 us | 12 | | 32 - 64 us | 30 | # | 64 - 128 us | 98 | ### | 128 - 256 us | 53 | # | 256 - 512 us | 57 | ## | 512 - 1024 us | 9 | | 1 - 2 ms | 9 | | 2 - 4 ms | 1 | | 4 - 8 ms | 98 | ### | 8 - 16 ms | 5 | | 16 - 32 ms | 7 | | 32 - 64 ms | 32 | # | 64 - 128 ms | 10 | | 128 - 256 ms | 10 | | 256 - 512 ms | 2 | | 512 - 1024 ms | 0 | | 1 - ... s | 0 | | Fixes: 690a052a6d85c530 ("perf ftrace latency: Add --max-latency option") Reviewed-by: Gabriele Monaco <gmonaco@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Gabriele Monaco <gmonaco@redhat.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250108210015.1188531-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10Merge tag 'v6.13-rockchip-dtsfixes1' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes Fixed card-detect on one board and some missing properties added. * tag 'v6.13-rockchip-dtsfixes1' of https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: arm64: dts: rockchip: add hevc power domain clock to rk3328 arm64: dts: rockchip: Fix the SD card detection on NanoPi R6C/R6S arm64: dts: rockchip: rename rfkill label for Radxa ROCK 5B arm64: dts: rockchip: add reset-names for combphy on rk3568 Link: https://lore.kernel.org/r/2914560.yaVYbkx8dN@diego Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-01-10perf: map pages in advanceLorenzo Stoakes
We are adjusting struct page to make it smaller, removing unneeded fields which correctly belong to struct folio. Two of those fields are page->index and page->mapping. Perf is currently making use of both of these. This is unnecessary. This patch eliminates this. Perf establishes its own internally controlled memory-mapped pages using vm_ops hooks. The first page in the mapping is the read/write user control page, and the rest of the mapping consists of read-only pages. The VMA is backed by kernel memory either from the buddy allocator or vmalloc depending on configuration. It is intended to be mapped read/write, but because it has a page_mkwrite() hook, vma_wants_writenotify() indicates that it should be mapped read-only. When a write fault occurs, the provided page_mkwrite() hook, perf_mmap_fault() (doing double duty handing faults as well) uses the vmf->pgoff field to determine if this is the first page, allowing for the desired read/write first page, read-only rest mapping. For this to work the implementation has to carefully work around faulting logic. When a page is write-faulted, the fault() hook is called first, then its page_mkwrite() hook is called (to allow for dirty tracking in file systems). On fault we set the folio's mapping in perf_mmap_fault(), this is because when do_page_mkwrite() is subsequently invoked, it treats a missing mapping as an indicator that the fault should be retried. We also set the folio's index so, given the folio is being treated as faux user memory, it correctly references its offset within the VMA. This explains why the mapping and index fields are used - but it's not necessary. We preallocate pages when perf_mmap() is called for the first time via rb_alloc(), and further allocate auxiliary pages via rb_aux_alloc() as needed if the mapping requires it. This allocation is done in the f_ops->mmap() hook provided in perf_mmap(), and so we can instead simply map all the memory right away here - there's no point in handling (read) page faults when we don't demand page nor need to be notified about them (perf does not). This patch therefore changes this logic to map everything when the mmap() hook is called, establishing a PFN map. It implements vm_ops->pfn_mkwrite() to provide the required read/write vs. read-only behaviour, which does not require the previously implemented workarounds. While it is not ideal to use a VM_PFNMAP here, doing anything else will result in the page_mkwrite() hook need to be provided, which requires the same page->mapping hack this patch seeks to undo. It will also result in the pages being treated as folios and placed on the rmap, which really does not make sense for these mappings. Semantically it makes sense to establish this as some kind of special mapping, as the pages are managed by perf and are not strictly user pages, but currently the only means by which we can do so functionally while maintaining the required R/W and R/O behaviour is a PFN map. There should be no change to actual functionality as a result of this change. Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20250103153151.124163-1-lorenzo.stoakes@oracle.com
2025-01-10perf/x86/intel/uncore: Support more units on Granite RapidsKan Liang
The same CXL PMONs support is also avaiable on GNR. Apply spr_uncore_cxlcm and spr_uncore_cxldp to GNR as well. The other units were broken on early HW samples, so they were ignored in the early enabling patch. The issue has been fixed and verified on the later production HW. Add UPI, B2UPI, B2HOT, PCIEX16 and PCIEX8 for GNR. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Eric Hu <eric.hu@intel.com> Link: https://lkml.kernel.org/r/20250108143017.1793781-2-kan.liang@linux.intel.com
2025-01-10perf/x86/intel/uncore: Clean up func_idKan Liang
The below warning may be triggered on GNR when the PCIE uncore units are exposed. WARNING: CPU: 4 PID: 1 at arch/x86/events/intel/uncore.c:1169 uncore_pci_pmu_register+0x158/0x190 The current uncore driver assumes that all the devices in the same PMU have the exact same devfn. It's true for the previous platforms. But it doesn't work for the new PCIE uncore units on GNR. The assumption doesn't make sense. There is no reason to limit the devices from the same PMU to the same devfn. Also, the current code just throws the warning, but still registers the device. The WARN_ON_ONCE() should be removed. The func_id is used by the later event_init() to check if a event->pmu has valid devices. For cpu and mmio uncore PMUs, they are always valid. For pci uncore PMUs, it's set when the PMU is registered. It can be replaced by the pmu->registered. Clean up the func_id. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Eric Hu <eric.hu@intel.com> Link: https://lkml.kernel.org/r/20250108143017.1793781-1-kan.liang@linux.intel.com
2025-01-10MAINTAINERS: Add static_call_inline.c to STATIC BRANCH/CALLJiri Slaby (SUSE)
Commit 8fd4ddda2f49 ("static_call: Don't make __static_call_return0 static") split static_call.c and created static_call_inline.c. This was not reflected in MAINTAINERS. Fix it by changing the MAINTAINERS line to be a glob: static_call*.c. Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20250109114703.426577-1-jirislaby@kernel.org
2025-01-10cleanup, tags: Create tags for the cleanup primitivesPeter Zijlstra
Oleg reported that it is hard to find the definition of things like: __free(argv) without having to do 'git grep "DEFINE_FREE(argv,"'. Add tag generation for the various macros in cleanup.h. Notably 'DEFINE_FREE(argv, ...)' will now generate a 'cleanup_argv' tag, while all the others, eg. 'DEFINE_GUARD(mutex, ...)' will generate 'class_mutex' like tags. Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20250106102647.GB20870@noisy.programming.kicks-ass.net
2025-01-10drm/amd/display: 3.2.316Ryan Seto
This version brings along following fixes: - Add some feature for secure display - Add replay desync error count tracking and reset - Update chip_cap defines and usage - Remove unnecessary eDP power down - Fix some stuttering/corruption issue on PSR panel - Cleanup and refactoring DML2.1 Acked-by: Wayne Lin <wayne.lin@amd.com> Reviewed-by: Martin Leung <martin.leung@amd.com> Signed-off-by: Ryan Seto <ryanseto@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: avoid reset DTBCLK at clock initCharlene Liu
[why & how] this is to init to HW real DTBCLK. and use real HW DTBCLK status to update internal logic state Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Martin Leung <martin.leung@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Ausef Yousof <Ausef.Yousof@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: improve dpia pre-trainPeichen Huang
[WHY] We see unstable DP LL 4.2.1.3 test result with dpia pre-train. It is because the outbox interrupt mechanism can not handle HPD immediately and require some improvement. [HOW] 1. not enable link if hpd_pending is true. 2. abort pre-train if training failed and hpd_pending is true. 3. check if 2 lane supported when it is alt mode Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com> Signed-off-by: Peichen Huang <PeiChen.Huang@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: Apply DML21 PatchesAustin Zheng
[Why & How] Add several DML21 fixes Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Austin Zheng <Austin.Zheng@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: Use HW lock mgr for PSR1Tom Chung
[Why] Without the dmub hw lock, it may cause the lock timeout issue while do modeset on PSR1 eDP panel. [How] Allow dmub hw lock for PSR1. Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: Revised for Replay Pseudo vblank controlDennis Chan
[why & how] Revised Replay Full screen video Pseudo vblank control. Reviewed-by: Allen Li <allen.li@amd.com> Signed-off-by: Dennis Chan <dennis.chan@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: Add a new flag for replay low hzRobin Chen
[Why & How] Add a new flag in replay_config to indicate the replay low hz status. Reviewed-by: Allen Li <allen.li@amd.com> Signed-off-by: Robin Chen <robin.chen@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: Remove unused read_ono_state function from Hwss moduleKarthi Kandasamy
[Why] The functions read_ono_state are no longer in use and have been identified as redundant. Removing them helps streamline the codebase and improve maintainability by eliminating unnecessary code. [How] These unused functions were removed from Hwss module, ensuring that no functionality is affected, and the code is simplified. Reviewed-by: Martin Leung <martin.leung@amd.com> Signed-off-by: Karthi Kandasamy <karthi.kandasamy@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10Merge tag 'vfs-6.13-rc7.fixes.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "afs: - Fix the maximum cell name length - Fix merge preference rule failure condition fuse: - Fix fuse_get_user_pages() so it doesn't risk misleading the caller to think pages have been allocated when they actually haven't - Fix direct-io folio offset and length calculation netfs: - Fix async direct-io handling - Fix read-retry for filesystems that don't provide a ->prepare_read() method vfs: - Prevent truncating 64-bit offsets to 32-bits in iomap - Fix memory barrier interactions when polling - Remove MNT_ONRB to fix concurrent modification of @mnt->mnt_flags leading to MNT_ONRB to not be raised and invalid access to a list member" * tag 'vfs-6.13-rc7.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: poll: kill poll_does_not_wait() sock_poll_wait: kill the no longer necessary barrier after poll_wait() io_uring_poll: kill the no longer necessary barrier after poll_wait() poll_wait: kill the obsolete wait_address check poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll() afs: Fix merge preference rule failure condition netfs: Fix read-retry for fs with no ->prepare_read() netfs: Fix kernel async DIO fs: kill MNT_ONRB iomap: avoid avoid truncating 64-bit offset to 32 bits afs: Fix the maximum cell name length fuse: Set *nbytesp=0 in fuse_get_user_pages on allocation failure fuse: fix direct io folio offset and length calculation
2025-01-10drm/amd/display: Do not elevate mem_type change to full updateLeo Li
[Why] There should not be any need to revalidate bandwidth on memory placement change, since the fb is expected to be pinned to DCN-accessable memory before scanout. For APU it's DRAM, and DGPU, it's VRAM. However, async flips + memory type change needs to be rejected. [How] Do not set lock_and_validation_needed on mem_type change. Instead, reject an async_flip request if the crtc's buffer(s) changed mem_type. This may fix stuttering/corruption experienced with PSR SU and PSR1 panels, if the compositor allocates fbs in both VRAM carveout and GTT and flips between them. Fixes: a7c0cad0dc06 ("drm/amd/display: ensure async flips are only accepted for fast updates") Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10spi: Add spi_mem_calc_op_duration() helperMark Brown
Merge series from Miquel Raynal <miquel.raynal@bootlin.com>: Add a spi_mem_calc_op_duration() helper
2025-01-10drm/amd/display: Do not wait for PSR disable on vbl enableLeo Li
[Why] Outside of a modeset/link configuration change, we should not have to wait for the panel to exit PSR. Depending on the panel and it's state, it may take multiple frames for it to exit PSR. Therefore, waiting in all scenarios may cause perceived stuttering, especially in combination with faster vblank shutdown. [How] PSR1 disable is hooked up to the vblank enable event, and vice versa. In case of vblank enable, do not wait for panel to exit PSR, but still wait in all other cases. We also avoid a call to unnecessarily change power_opts on disable - this ends up sending another command to dmcub fw. When testing against IGT, some crc tests like kms_plane_alpha_blend and amd_hotplug were failing due to CRC timeouts. This was found to be caused by the early return before HW has fully exited PSR1. Fix this by first making sure we grab a vblank reference, then waiting for panel to exit PSR1, before programming hw for CRC generation. Fixes: 58a261bfc967 ("drm/amd/display: use a more lax vblank enable policy for older ASICs") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3743 Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: Remove unnecessary eDP power downYiling Chen
[why] When first time of link training is fail, eDP would be powered down and would not be powered up for next retry link training. It causes that all of retry link linking would be fail. [how] We has extracted both power up and down sequence from enable/disable link output function before DCN32. We remov eDP power down in dcn32_disable_link_output(). Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Yiling Chen <yi-ling.chen2@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10Revert "drm/amd/display: Enable urgent latency adjustments for DCN35"Nicholas Susanto
Revert commit 284f141f5ce5 ("drm/amd/display: Enable urgent latency adjustments for DCN35") [Why & How] Urgent latency increase caused 2.8K OLED monitor caused it to block this panel support P0. Reverting this change does not reintroduce the netflix corruption issue which it fixed. Fixes: 284f141f5ce5 ("drm/amd/display: Enable urgent latency adjustments for DCN35") Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Nicholas Susanto <Nicholas.Susanto@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: Add SMU interface to get UMC count for dcn401Dillon Varone
[WHY&HOW] BIOS table will not always contain accurate UMC channel info when harvesting is enabled, so get the correct info from SMU. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10Merge tag 'xfs-fixes-6.13-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fixes from Carlos Maiolino: - Fix a missing lock while detaching a dquot buffer - Fix failure on xfs_update_last_rtgroup_size for !XFS_RT * tag 'xfs-fixes-6.13-rc7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: lock dquot buffer before detaching dquot from b_li_list xfs: don't return an error from xfs_update_last_rtgroup_size for !XFS_RT
2025-01-10drm/amd/display: Initialize denominator defaults to 1Alex Hung
[WHAT & HOW] Variables, used as denominators and maybe not assigned to other values, should be initialized to non-zero to avoid DIVIDE_BY_ZERO, as reported by Coverity. Reviewed-by: Austin Zheng <austin.zheng@amd.com> Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: Extend secure display to support DisplayCRC modeWayne Lin
[Why] For the legacy secure display, it involves PSP + DMUB to confgiure and retrieve the CRC/ROI result. Have requirement to support mode which all handled by driver only. [How] Add another "DisplayCRC" mode, which doesn't involve PSP + DMUB. All things are handled by the driver only Reviewed-by: HaoPing Liu <haoping.liu@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: Add support to configure CRC window on specific CRC instanceWayne Lin
[Why] Have the need to specify the CRC window on specific CRC engine. dc_stream_configure_crc() today calculates CRC on crc engine 0 only and always resets CRC engine at first. [How] Add index parameter to dc_stream_configure_crc() for selecting the desired crc engine. Additionally, add another parameter to specify whether to skip the default reset of crc engine. Reviewed-by: HaoPing Liu <haoping.liu@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: Reduce accessing remote DPCD overheadWayne Lin
[Why] Observed frame rate get dropped by tool like glxgear. Even though the output to monitor is 60Hz, the rendered frame rate drops to 30Hz lower. It's due to code path in some cases will trigger dm_dp_mst_is_port_support_mode() to read out remote Link status to assess the available bandwidth for dsc maniplation. Overhead of keep reading remote DPCD is considerable. [How] Store the remote link BW in mst_local_bw and use end-to-end full_pbn as an indicator to decide whether update the remote link bw or not. Whenever we need the info to assess the BW, visit the stored one first. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3720 Fixes: fa57924c76d9 ("drm/amd/display: Refactor function dm_dp_mst_is_port_support_mode()") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: Validate mdoe under MST LCT=1 case as wellWayne Lin
[Why & How] Currently in dm_dp_mst_is_port_support_mode(), when valdidating mode under dsc decoding at the last DP link config, we only validate the case when there is an UFP. However, if the MSTB LCT=1, there is no UFP. Under this case, use root_link_bw_in_kbps as the available bw to compare. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3720 Fixes: fa57924c76d9 ("drm/amd/display: Refactor function dm_dp_mst_is_port_support_mode()") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jerry Zuo <jerry.zuo@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: DML2.1 Post-Si CleanupRafal Ostrowski
[Why] There are a few cleanup and refactoring tasks that need to be done with the DML2.1 wrapper and DC interface to remove dependencies on legacy structures and N-1 prototypes. [How] Implemented pipe_ctx->global_sync. Implemented new functions to use pipe_ctx->hubp_regs and pipe_ctx->global_sync: - hubp_setup2 - hubp_setup_interdependent2 - Several other new functions for DCN 4.01 to support newer structures Removed dml21_update_pipe_ctx_dchub_regs Removed dml21_extract_legacy_watermark_set Removed dml21_populate_pipe_ctx_dlg_param Removed outdated dcn references in DML2.1 wrapper. Reviewed-by: Austin Zheng <austin.zheng@amd.com> Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Rafal Ostrowski <rostrows@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: Update chip_cap defines and usageMichael Strauss
[WHY] The defines have also been updated with prefix AMD_ and atomfirmware.h has been temporarily updated with both sets of defines to allow the transition. This update is being made to standardize workaround chip_cap flags, in order to support more workaround flags in the future. [HOW] Updated EXT_DISPLAY_PATH_CAPS__DP_FIXED_VS_EN define, the flag is now an enum masked by EXT_DISPLAY_PATH_CAPS__EXT_CHIP_MASK. All checks for DP_FIXED_VS_EN are now performed by masking with EXT_CHIP_MASK and checking for an exact match rather than the previous bitwise AND check. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Michael Strauss <michael.strauss@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: [FW Promotion] Release 0.0.248.0Taimur Hassan
Refactoring some flags for replay Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: Add replay desync error count tracking and reset functionalityJack Chang
[Why & How] Build-up get/reset desync error count interface and implement the functions. Reviewed-by: ChunTao Tso <chuntao.tso@amd.com> Reviewed-by: Robin Chen <robin.chen@amd.com> Signed-off-by: Jack Chang <jack.chang@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: Log Hard Min Clocks and Phantom Pipe StatusSung Lee
[WHY] On entering/exiting idle power, certain parameters would be very useful to know for power profiling purposes. [HOW] This commit adds certain hard min clocks and pipe types to log output on idle optimization enter/exit. Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Sung Lee <Sung.Lee@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: Limit Scaling Ratio on DCN3.01Gabe Teeger
[why] Underflow and flickering was occuring due to high scaling ratios when resizing videos. [how] Limit the scaling ratios by increasing the max scaling factor Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Gabe Teeger <Gabe.Teeger@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amdgpu/smu13: update powersave optimizationsAlex Deucher
Only apply when compute profile is selected. This is the only supported configuration. Selecting other profiles can lead to performane degradations. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10drm/amd/display: add CEC notifier to amdgpu driverKun Liu
This patch adds the cec_notifier feature to amdgpu driver. The changes will allow amdgpu driver code to notify EDID and HPD changes to an eventual CEC adapter. Signed-off-by: Kun Liu <Kun.Liu2@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-10pstore/zone: avoid dereferencing zero sized ptr after init zonesEugen Hristev
In psz_init_zones, if the requested area has a total_size less than record_size, kcalloc will be called with c == 0 and will return ZERO_SIZE_PTR. Further, this will lead to an oops. With this patch, in this scenario, it will look like this : [ 6.865545] pstore_zone: total size : 28672 Bytes [ 6.865547] pstore_zone: kmsg size : 65536 Bytes [ 6.865549] pstore_zone: pmsg size : 0 Bytes [ 6.865551] pstore_zone: console size : 0 Bytes [ 6.865553] pstore_zone: ftrace size : 0 Bytes [ 6.872095] pstore_zone: zone dmesg total_size too small [ 6.878234] pstore_zone: alloc zones failed Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org> Link: https://lore.kernel.org/r/20250110125714.2594719-1-eugen.hristev@linaro.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-01-10binfmt_flat: Fix integer overflow bug on 32 bit systemsDan Carpenter
Most of these sizes and counts are capped at 256MB so the math doesn't result in an integer overflow. The "relocs" count needs to be checked as well. Otherwise on 32bit systems the calculation of "full_data" could be wrong. full_data = data_len + relocs * sizeof(unsigned long); Fixes: c995ee28d29d ("binfmt_flat: prevent kernel dammage from corrupted executable headers") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Nicolas Pitre <npitre@baylibre.com> Link: https://lore.kernel.org/r/5be17f6c-5338-43be-91ef-650153b975cb@stanley.mountain Signed-off-by: Kees Cook <kees@kernel.org>
2025-01-10ALSA: hda: Fix compilation of snd_hdac_adsp_xxx() helpersCezary Rojewski
The snd_hdac_adsp_xxx() wrap snd_hdac_reg_xxx() helpers to simplify register access for AudioDSP drivers e.g.: the avs-driver. Byte- and word-variants of said helps do not expand to bare readx/writex() operations but functions instead and, due to pointer type incompatibility, cause compilation to fail. As the macros are utilized by the avs-driver alone, relocate the code introduced with commit c19bd02e9029 ("ALSA: hda: Add helper macros for DSP capable devices") into the avs/ directory and update it to operate on 'adev' i.e.: the avs-driver-context directly to fix the issue. Fixes: c19bd02e9029 ("ALSA: hda: Add helper macros for DSP capable devices") Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com> Acked-by: Mark Brown <broonie@kernel.org> Link: https://patch.msgid.link/20250110113326.3809897-2-cezary.rojewski@intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>