summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-05-28drm/vkms: Add range and encoding properties to the planeArthur Grillo
Now that the driver internally handles these quantization ranges and YUV encoding matrices, expose the UAPI for setting them. Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net> [Louis Chauvet: retained only relevant parts, updated the commit message] Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com> Acked-by: Maxime Ripard <mripard@kernel.org> Link: https://lore.kernel.org/r/20250415-yuv-v18-3-f2918f71ec4b@bootlin.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-05-28drm/vkms: Add YUV supportArthur Grillo
Add support to the YUV formats bellow: - NV12/NV16/NV24 - NV21/NV61/NV42 - YUV420/YUV422/YUV444 - YVU420/YVU422/YVU444 The conversion from yuv to rgb is done with fixed-point arithmetic, using 32.32 fixed-point numbers and the drm_fixed helpers. To do the conversion, a specific matrix must be used for each color range (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in the `conversion_matrix` struct, along with the specific y_offset needed. This matrix is queried only once, in `vkms_plane_atomic_update` and stored in a `vkms_plane_state`. Those conversion matrices of each encoding and range were obtained by rounding the values of the original conversion matrices multiplied by 2^32. This is done to avoid the use of floating point operations. The same reading function is used for YUV and YVU formats. As the only difference between those two category of formats is the order of field, a simple swap in conversion matrix columns allows using the same function. [Louis Chauvet: - Adapted Arthur's work - Implemented the read_line_t callbacks for yuv - add struct conversion_matrix - store the whole conversion_matrix in the plane state - remove struct pixel_yuv_u8 - update the commit message - Merge the modifications from Arthur] Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net> Acked-by: Maxime Ripard <mripard@kernel.org> Link: https://lore.kernel.org/r/20250415-yuv-v18-2-f2918f71ec4b@bootlin.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-05-28drm/vkms: Document pixel_argb_u16Louis Chauvet
The meaning of each member of the structure was not specified. To clarify the format used and the reason behind those choices, add some documentation. Reviewed-by: Pekka Paalanen <pekka.paalanen@collabora.com> Link: https://lore.kernel.org/r/20250415-yuv-v18-1-f2918f71ec4b@bootlin.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
2025-05-28drm/amdgpu: update trace format to match gpu_scheduler_tracePierre-Eric Pelloux-Prayer
Log fences using the same format for coherency. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250526125505.2360-11-pierre-eric.pelloux-prayer@amd.com
2025-05-28drm/doc: Document some tracepoints as uAPIPierre-Eric Pelloux-Prayer
This commit adds a document section in drm-uapi.rst about tracepoints, and mark the events gpu_scheduler_trace.h as stable uAPI. The goal is to explicitly state that tools can rely on the fields, formats and semantics of these events. Acked-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250526125505.2360-10-pierre-eric.pelloux-prayer@amd.com
2025-05-28drm: Get rid of drm_sched_job.idPierre-Eric Pelloux-Prayer
Its only purpose was for trace events, but jobs can already be uniquely identified using their fence. The downside of using the fence is that it's only available after 'drm_sched_job_arm' was called which is true for all trace events that used job.id so they can safely switch to using it. Suggested-by: Tvrtko Ursulin <tursulin@igalia.com> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Arvind Yadav <arvind.yadav@amd.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250526125505.2360-9-pierre-eric.pelloux-prayer@amd.com
2025-05-28drm/sched: Cleanup event namesPierre-Eric Pelloux-Prayer
All events now start with the same prefix (drm_sched_job_). drm_sched_job_wait_dep was misleading because it wasn't waiting at all. It's now replaced by trace_drm_sched_job_unschedulable, which is only traced if the job cannot be scheduled. For moot dependencies, nothing is traced. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250526125505.2360-8-pierre-eric.pelloux-prayer@amd.com
2025-05-28drm/sched: Add the drm_client_id to the drm_sched_run/exec_job eventsPierre-Eric Pelloux-Prayer
For processes with multiple drm_file instances, the drm_client_id is the only way to map jobs back to their unique owner. It's even more useful if drm client_name is set, because now a tool can map jobs to the client name instead of only having access to the process name. Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Philipp Stanner <phasta@kernel.org> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250526125505.2360-7-pierre-eric.pelloux-prayer@amd.com
2025-05-28drm/sched: Trace dependencies for GPU jobsPierre-Eric Pelloux-Prayer
We can't trace dependencies from drm_sched_job_add_dependency because when it's called the job's fence is not available yet. So instead each dependency is traced individually when drm_sched_entity_push_job is used. Tracing the dependencies allows tools to analyze the dependencies between the jobs (previously it was only possible for fences traced by drm_sched_job_wait_dep). Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250526125505.2360-6-pierre-eric.pelloux-prayer@amd.com
2025-05-28drm/sched: Cleanup gpu_scheduler trace eventsPierre-Eric Pelloux-Prayer
A fence uniquely identify a job, so this commits updates the places where a kernel pointer was used as an identifier by: "fence=%llu:%llu" Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250526125505.2360-5-pierre-eric.pelloux-prayer@amd.com
2025-05-28drm/sched: Add device name to the drm_sched_process_job eventPierre-Eric Pelloux-Prayer
Since switching the scheduler from using kthreads to workqueues in commit a6149f039369 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread") userspace applications cannot determine the device from the PID of the threads sending the trace events anymore. Each queue had its own kthread which had a given PID for the whole time. So, at least for amdgpu, it was possible to associate a PID to the hardware queues of each GPU in the system. Then, when a drm_run_job trace event was received by userspace, the source PID allowed to associate it back to the correct GPU. With workqueues this is not possible anymore, so the event needs to contain the dev_name() to identify the device. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250526125505.2360-4-pierre-eric.pelloux-prayer@amd.com
2025-05-28drm/sched: Store the drm client_id in drm_sched_fencePierre-Eric Pelloux-Prayer
This will be used in a later commit to trace the drm client_id in some of the gpu_scheduler trace events. This requires changing all the users of drm_sched_job_init to add an extra parameter. The newly added drm_client_id field in the drm_sched_fence is a bit of a duplicate of the owner one. One suggestion I received was to merge those 2 fields - this can't be done right now as amdgpu uses some special values (AMDGPU_FENCE_OWNER_*) that can't really be translated into a client id. Christian is working on getting rid of those; when it's done we should be able to squash owner/drm_client_id together. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250526125505.2360-3-pierre-eric.pelloux-prayer@amd.com
2025-05-28drm/debugfs: Output client_id in in drm_clients_infoPierre-Eric Pelloux-Prayer
client_id is a unique id used by fdinfo. Having it listed in 'clients' output means a userspace application can correlate the fields, eg: given a fdinfo id get the fdinfo name. Geiven that client_id is a uint64_t, we use a %20llu printf format to keep the output aligned (20 = digit count of the biggest uint64_t). Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250526125505.2360-2-pierre-eric.pelloux-prayer@amd.com
2025-05-28drm/bridge: analogix_dp: Fix clk-disable removalHeiko Stuebner
Commit 6579a03e68ff ("drm/bridge: analogix_dp: Remove the unnecessary calls to clk_disable_unprepare() during probing") removed the mismatched clock_disable calls from analogix_dp_probe. But that patch was created and sent before commit e5e9fa9f7aad ("drm/bridge: analogix_dp: Add support to get panel from the DP AUX bus") was merged, so couldn't know about this change. So in the original patch the last change is if (ret) { dev_err(&pdev->dev, "failed to request irq\n"); - goto err_disable_clk; + return ERR_PTR(ret); } disable_irq(dp->irq); return dp; - -err_disable_clk: - clk_disable_unprepare(dp->clock); - return ERR_PTR(ret); } EXPORT_SYMBOL_GPL(analogix_dp_probe); the analogix_dp_core.c actually now has the runtime-pm handling between disable_irq() and return do introducing another goto err_clk_disable there. So remove that one too and return an error pointer, to not create build breakage. Fixes: 6579a03e68ff ("drm/bridge: analogix_dp: Remove the unnecessary calls to clk_disable_unprepare() during probing") Signed-off-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250527225120.3361663-1-heiko@sntech.de Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-05-28drm/bridge: adv7511: Rename adv7511_dsi_config_timing_gen() into ↵Tommaso Merciai
adv7533_dsi_config_timing_gen() To preserve the drivers naming convention rename adv7511_dsi_config_timing_gen() into adv7533_dsi_config_timing_gen() Signed-off-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com> Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250528070452.901183-3-tommaso.merciai.xr@bp.renesas.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-05-28drm/bridge: adv7511: Move adv711_dsi_config_timing_gen() into adv7511_mode_set()Tommaso Merciai
adv7511_mode_set() currently updates only the sync registers of the ADV bridge. At the end, drm_mode_copy() updates the current mode, but the horizontal and vertical porch registers of the ADV bridge still retain values from the old mode. Move adv7511_dsi_config_timing_gen() into adv7511_mode_set() to ensure the horizontal and vertical porch registers are correctly updated. Fixes: ae01d3183d2763ed ("drm/bridge: adv7511: switch to the HDMI connector helpers") Reported-by: Biju Das <biju.das.jz@bp.renesas.com> Closes: https://lore.kernel.org/all/aDB8bD6cF7qiSpKd@tom-desktop/ Signed-off-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com> Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com> Tested-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250528070452.901183-2-tommaso.merciai.xr@bp.renesas.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-05-28dt-bindings: gpu: mali-bifrost: Add compatible for RZ/G3E SoCTommaso Merciai
Add a compatible string for the Renesas RZ/G3E SoC variants that include a Mali-G52 GPU. These variants share the same restrictions on interrupts, clocks, and power domains as the RZ/G2L SoC, so extend the existing schema validation accordingly. Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com> Link: https://lore.kernel.org/r/20250528073040.904033-1-tommaso.merciai.xr@bp.renesas.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-05-28perf test trace_summary: Skip --bpf-summary tests if no libbpfIan Rogers
If perf is built without libbpf (e.g. NO_LIBBPF=1) then the --bpf-summary perf trace tests will fail. Skip the tests as this is expected behavior. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Howard Chu <howardchu95@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Gary Guo <gary@garyguo.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250528032637.198960-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf test intel-pt: Skip jitdump test if no libelfIan Rogers
jitdump support is only present if building with libelf. Skip the intel-pt jitdump test if perf isn't compiled with libelf support. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250528032637.198960-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf intel-tpebs: Avoid race when evlist is being deletedIan Rogers
Reading through the evsel->evlist may seg fault if a sample arrives when the evlist is being deleted. Detect this case and ignore samples arriving when the evlist is being deleted. Fixes: bcfab08db7fb38bf ("perf intel-tpebs: Filter non-workload samples") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250528032637.198960-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf test demangle-java: Don't segv if demangling failsIan Rogers
The buffer returned by dso__demangle_sym() may be NULL, don't segv in strcmp if this happens. Currently this happens for NO_LIBELF=1 builds. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250528032637.198960-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf symbol: Fix use-after-free in filename__read_build_idIan Rogers
The same buf is used for the program headers and reading notes. As the notes memory may be reallocated then this corrupts the memory pointed to by the phdr. Using the same buffer is in any case a logic error. Rather than deal with the duplicated code, introduce an elf32 boolean and a union for either the elf32 or elf64 headers that are in use. Let the program headers have their own memory and grow the buffer for notes as necessary. Before `perf list -j` compiled with asan would crash with: ``` ==4176189==ERROR: AddressSanitizer: heap-use-after-free on address 0x5160000070b8 at pc 0x555d3b15075b bp 0x7ffebb5a8090 sp 0x7ffebb5a8088 READ of size 8 at 0x5160000070b8 thread T0 #0 0x555d3b15075a in filename__read_build_id tools/perf/util/symbol-minimal.c:212:25 #1 0x555d3ae43aff in filename__sprintf_build_id tools/perf/util/build-id.c:110:8 ... 0x5160000070b8 is located 312 bytes inside of 560-byte region [0x516000006f80,0x5160000071b0) freed by thread T0 here: #0 0x555d3ab21840 in realloc (perf+0x264840) (BuildId: 12dff2f6629f738e5012abdf0e90055518e70b5e) #1 0x555d3b1506e7 in filename__read_build_id tools/perf/util/symbol-minimal.c:206:11 ... previously allocated by thread T0 here: #0 0x555d3ab21423 in malloc (perf+0x264423) (BuildId: 12dff2f6629f738e5012abdf0e90055518e70b5e) #1 0x555d3b1503a2 in filename__read_build_id tools/perf/util/symbol-minimal.c:182:9 ... ``` Note: this bug is long standing and not introduced by the other asan fix in commit fa9c4977fbfb ("perf symbol-minimal: Fix double free in filename__read_build_id"). Fixes: b691f64360ecec49 ("perf symbols: Implement poor man's ELF parser") Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250528032637.198960-2-irogers@google.com Cc: Mark Rutland <mark.rutland@arm.com> Cc: Gary Guo <gary@garyguo.net> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf pmu: Avoid segv for missing name/alias_name in wildcardingIan Rogers
The pmu name or alias_name fields may be NULL and should be skipped if so. This is done in all loops of perf_pmu___name_match except the final wildcard loop which was an oversight. Fixes: 63e287131cf0c59b ("perf pmu: Rename name matching for no suffix or wildcard variants") Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250527215035.187992-1-irogers@google.com [ Fixup the Fixes: tag to the right commit ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf machine: Factor creating a "live" machine out of dwarf-unwindIan Rogers
Factor out for use in places other than the dwarf unwinding tests for libunwind. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anne Macedo <retpolanne@posteo.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250313052952.871958-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28ASoC: dt-bindings: qcom,sm8250: Add Fairphone 5 sound cardLuca Weiss
Document the bindings for the sound card on Fairphone 5 which uses the older non-audioreach audio architecture. Acked-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Luca Weiss <luca.weiss@fairphone.com> Link: https://lore.kernel.org/r/20250507-fp5-dp-sound-v4-1-4098e918a29e@fairphone.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2025-05-28s390/uv: Always return 0 from s390_wiggle_split_folio() if successfulDavid Hildenbrand
Let's consistently return 0 if the operation was successful, and just detect ourselves whether splitting is required -- folio_test_large() is a cheap operation. Update the documentation. Should we simply always return -EAGAIN instead of 0, so we don't have to handle it in the caller? Not sure, staring at the documentation, this way looks a bit cleaner. Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20250516123946.1648026-3-david@redhat.com Message-ID: <20250516123946.1648026-3-david@redhat.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2025-05-28s390/uv: Don't return 0 from make_hva_secure() if the operation was not ↵David Hildenbrand
successful If s390_wiggle_split_folio() returns 0 because splitting a large folio succeeded, we will return 0 from make_hva_secure() even though a retry is required. Return -EAGAIN in that case. Otherwise, we'll return 0 from gmap_make_secure(), and consequently from unpack_one(). In kvm_s390_pv_unpack(), we assume that unpacking succeeded and skip unpacking this page. Later on, we run into issues and fail booting the VM. So far, this issue was only observed with follow-up patches where we split large pagecache XFS folios. Maybe it can also be triggered with shmem? We'll cleanup s390_wiggle_split_folio() a bit next, to also return 0 if no split was required. Fixes: d8dfda5af0be ("KVM: s390: pv: fix race when making a page secure") Cc: stable@vger.kernel.org Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20250516123946.1648026-2-david@redhat.com Message-ID: <20250516123946.1648026-2-david@redhat.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2025-05-28ASoC: tas571x: fix tas5733 num_controlsBram Vlerick
Commit e3de7984e451 ("ASoC: tas571x: add separate tas5733 controls") introduces a separate struct for the tas5733 controls but did not update the num_controls with the correct ARRAY_SIZE. Fixes: e3de7984e451 ("ASoC: tas571x: add separate tas5733 controls") Signed-off-by: Bram Vlerick <bram.vlerick@openpixelsystems.org> Acked-by: Peter Korsgaard <peter@korsgaard.com> Link: https://patch.msgid.link/20250528-tas5733-fix-controls-size-v1-1-5c70595accaf@openpixelsystems.org Signed-off-by: Mark Brown <broonie@kernel.org>
2025-05-28Merge branch 'kvm-lockdep-common' into HEADPaolo Bonzini
Introduce new mutex locking functions mutex_trylock_nest_lock() and mutex_lock_killable_nest_lock() and use them to clean up locking of all vCPUs for a VM. For x86, this removes some complex code that was used instead of lockdep's "nest_lock" feature. For ARM and RISC-V, this removes a lockdep warning when the VM is configured to have more than MAX_LOCK_DEPTH vCPUs, and removes a fair amount of duplicate code by sharing the logic across all architectures. Signed-off-by: Paolo BOnzini <pbonzini@redhat.com>
2025-05-28rust: add helper for mutex_trylockPaolo Bonzini
After commit c5b6ababd21a ("locking/mutex: implement mutex_trylock_nested", currently in the KVM tree) mutex_trylock() will be a macro when lockdep is enabled. Rust therefore needs the corresponding helper. Just add it and the rust/bindings/bindings_helpers_generated.rs Makefile rules will do their thing. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20250528083431.1875345-1-pbonzini@redhat.com> Acked-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-05-28accel/ivpu: Reorder Doorbell Unregister and Command Queue DestructionKarol Wachowski
Refactor ivpu_cmdq_unregister() to ensure the doorbell is unregistered before destroying the command queue. The NPU firmware requires doorbells to be unregistered prior to command queue destruction. If doorbell remains registered when command queue destroy command is sent firmware will automatically unregister the doorbell, making subsequent unregister attempts no-operations (NOPs). Ensure compliance with firmware expectations by moving the doorbell unregister call ahead of the command queue destruction logic, thus preventing unnecessary NOP operation. Fixes: 465a3914b254 ("accel/ivpu: Add API for command queue create/destroy/submit") Signed-off-by: Karol Wachowski <karol.wachowski@intel.com> Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://lore.kernel.org/r/20250515094124.255141-1-jacek.lawrynowicz@linux.intel.com
2025-05-28accel/ivpu: Use firmware names from upstream repoJacek Lawrynowicz
Use FW names from linux-firmware repo instead of deprecated ones. The vpu_37xx.bin style names were never released and were only used for internal testing, so it is safe to remove them. Fixes: c140244f0cfb ("accel/ivpu: Add initial Panther Lake support") Cc: stable@vger.kernel.org # v6.13+ Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://lore.kernel.org/r/20250506092030.280276-1-jacek.lawrynowicz@linux.intel.com
2025-05-28accel/ivpu: Improve buffer object loggingJacek Lawrynowicz
- Fix missing alloc log when drm_gem_handle_create() fails in drm_vma_node_allow() and open callback is not called - Add ivpu_bo->ctx_id that enables to log the actual context id instead of using 0 as default - Add couple WARNs and errors so we can catch more memory corruption issues Fixes: 37dee2a2f433 ("accel/ivpu: Improve buffer object debug logs") Cc: stable@vger.kernel.org # v6.8+ Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://lore.kernel.org/r/20250506091303.262034-1-jacek.lawrynowicz@linux.intel.com
2025-05-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni
Merge in late fixes to prepare for the 6.16 net-next PR. No conflicts nor adjacent changes. Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28selftests/bpf: Fix bpf selftest build warningSaket Kumar Bhaskar
On linux-next, build for bpf selftest displays a warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/if_xdp.h' differs from latest version at 'include/uapi/linux/if_xdp.h'. Commit 8066e388be48 ("net: add UAPI to the header guard in various network headers") changed the header guard from _LINUX_IF_XDP_H to _UAPI_LINUX_IF_XDP_H in include/uapi/linux/if_xdp.h. To resolve the warning, update tools/include/uapi/linux/if_xdp.h to align with the changes in include/uapi/linux/if_xdp.h Fixes: 8066e388be48 ("net: add UAPI to the header guard in various network headers") Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Closes: https://lore.kernel.org/all/c2bc466d-dff2-4d0d-a797-9af7f676c065@linux.ibm.com/ Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://patch.msgid.link/20250527054138.1086006-1-skb99@linux.ibm.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28selftests: netfilter: Fix skip of wildcard interface testPhil Sutter
The script is supposed to skip wildcard interface testing if unsupported by the host's nft tool. The failing check caused script abort due to 'set -e' though. Fix this by running the potentially failing nft command inside the if-conditional pipe. Fixes: 73db1b5dab6f ("selftests: netfilter: Torture nftables netdev hooks") Signed-off-by: Phil Sutter <phil@nwl.cc> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Link: https://patch.msgid.link/20250527094117.18589-1-phil@nwl.cc Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28iomap: don't lose folio dropbehind state for overwritesJens Axboe
DONTCACHE I/O must have the completion punted to a workqueue, just like what is done for unwritten extents, as the completion needs task context to perform the invalidation of the folio(s). However, if writeback is started off filemap_fdatawrite_range() off generic_sync() and it's an overwrite, then the DONTCACHE marking gets lost as iomap_add_to_ioend() don't look at the folio being added and no further state is passed down to help it know that this is a dropbehind/DONTCACHE write. Check if the folio being added is marked as dropbehind, and set IOMAP_IOEND_DONTCACHE if that is the case. Then XFS can factor this into the decision making of completion context in xfs_submit_ioend(). Additionally include this ioend flag in the NOMERGE flags, to avoid mixing it with unrelated IO. Since this is the 3rd flag that will cause XFS to punt the completion to a workqueue, add a helper so that each one of them can get appropriately commented. This fixes extra page cache being instantiated when the write performed is an overwrite, rather than newly instantiated blocks. Fixes: b2cd5ae693a3 ("iomap: make buffered writes work with RWF_DONTCACHE") Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/5153f6e8-274d-4546-bf55-30a5018e0d03@kernel.dk Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-28virtio: reject shm region if length is zeroSami Uddin
Prevent usage of shared memory regions where the length is zero, as such configurations are not valid and may lead to unexpected behavior. Signed-off-by: Sami Uddin <sami.md.ko@gmail.com> Message-Id: <20250511222153.2332-1-sami.md.ko@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2025-05-28net: phy: mscc: Stop clearing the the UDPv4 checksum for L2 framesHoratiu Vultur
We have noticed that when PHY timestamping is enabled, L2 frames seems to be modified by changing two 2 bytes with a value of 0. The place were these 2 bytes seems to be random(or I couldn't find a pattern). In most of the cases the userspace can ignore these frames but if for example those 2 bytes are in the correction field there is nothing to do. This seems to happen when configuring the HW for IPv4 even that the flow is not enabled. These 2 bytes correspond to the UDPv4 checksum and once we don't enable clearing the checksum when using L2 frames then the frame doesn't seem to be changed anymore. Fixes: 7d272e63e0979d ("net: phy: mscc: timestamping and PHC support") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://patch.msgid.link/20250523082716.2935895-1-horatiu.vultur@microchip.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28net: openvswitch: Fix the dead loop of MPLS parseFaicker Mo
The unexpected MPLS packet may not end with the bottom label stack. When there are many stacks, The label count value has wrapped around. A dead loop occurs, soft lockup/CPU stuck finally. stack backtrace: UBSAN: array-index-out-of-bounds in /build/linux-0Pa0xK/linux-5.15.0/net/openvswitch/flow.c:662:26 index -1 is out of range for type '__be32 [3]' CPU: 34 PID: 0 Comm: swapper/34 Kdump: loaded Tainted: G OE 5.15.0-121-generic #131-Ubuntu Hardware name: Dell Inc. PowerEdge C6420/0JP9TF, BIOS 2.12.2 07/14/2021 Call Trace: <IRQ> show_stack+0x52/0x5c dump_stack_lvl+0x4a/0x63 dump_stack+0x10/0x16 ubsan_epilogue+0x9/0x36 __ubsan_handle_out_of_bounds.cold+0x44/0x49 key_extract_l3l4+0x82a/0x840 [openvswitch] ? kfree_skbmem+0x52/0xa0 key_extract+0x9c/0x2b0 [openvswitch] ovs_flow_key_extract+0x124/0x350 [openvswitch] ovs_vport_receive+0x61/0xd0 [openvswitch] ? kernel_init_free_pages.part.0+0x4a/0x70 ? get_page_from_freelist+0x353/0x540 netdev_port_receive+0xc4/0x180 [openvswitch] ? netdev_port_receive+0x180/0x180 [openvswitch] netdev_frame_hook+0x1f/0x40 [openvswitch] __netif_receive_skb_core.constprop.0+0x23a/0xf00 __netif_receive_skb_list_core+0xfa/0x240 netif_receive_skb_list_internal+0x18e/0x2a0 napi_complete_done+0x7a/0x1c0 bnxt_poll+0x155/0x1c0 [bnxt_en] __napi_poll+0x30/0x180 net_rx_action+0x126/0x280 ? bnxt_msix+0x67/0x80 [bnxt_en] handle_softirqs+0xda/0x2d0 irq_exit_rcu+0x96/0xc0 common_interrupt+0x8e/0xa0 </IRQ> Fixes: fbdcdd78da7c ("Change in Openvswitch to support MPLS label depth of 3 in ingress direction") Signed-off-by: Faicker Mo <faicker.mo@zenlayer.com> Acked-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Aaron Conole <aconole@redhat.com> Link: https://patch.msgid.link/259D3404-575D-4A6D-B263-1DF59A67CF89@zenlayer.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28calipso: Don't call calipso functions for AF_INET sk.Kuniyuki Iwashima
syzkaller reported a null-ptr-deref in txopt_get(). [0] The offset 0x70 was of struct ipv6_txoptions in struct ipv6_pinfo, so struct ipv6_pinfo was NULL there. However, this never happens for IPv6 sockets as inet_sk(sk)->pinet6 is always set in inet6_create(), meaning the socket was not IPv6 one. The root cause is missing validation in netlbl_conn_setattr(). netlbl_conn_setattr() switches branches based on struct sockaddr.sa_family, which is passed from userspace. However, netlbl_conn_setattr() does not check if the address family matches the socket. The syzkaller must have called connect() for an IPv6 address on an IPv4 socket. We have a proper validation in tcp_v[46]_connect(), but security_socket_connect() is called in the earlier stage. Let's copy the validation to netlbl_conn_setattr(). [0]: Oops: general protection fault, probably for non-canonical address 0xdffffc000000000e: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077] CPU: 2 UID: 0 PID: 12928 Comm: syz.9.1677 Not tainted 6.12.0 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:txopt_get include/net/ipv6.h:390 [inline] RIP: 0010: Code: 02 00 00 49 8b ac 24 f8 02 00 00 e8 84 69 2a fd e8 ff 00 16 fd 48 8d 7d 70 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 53 02 00 00 48 8b 6d 70 48 85 ed 0f 84 ab 01 00 RSP: 0018:ffff88811b8afc48 EFLAGS: 00010212 RAX: dffffc0000000000 RBX: 1ffff11023715f8a RCX: ffffffff841ab00c RDX: 000000000000000e RSI: ffffc90007d9e000 RDI: 0000000000000070 RBP: 0000000000000000 R08: ffffed1023715f9d R09: ffffed1023715f9e R10: ffffed1023715f9d R11: 0000000000000003 R12: ffff888123075f00 R13: ffff88810245bd80 R14: ffff888113646780 R15: ffff888100578a80 FS: 00007f9019bd7640(0000) GS:ffff8882d2d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f901b927bac CR3: 0000000104788003 CR4: 0000000000770ef0 PKRU: 80000000 Call Trace: <TASK> calipso_sock_setattr+0x56/0x80 net/netlabel/netlabel_calipso.c:557 netlbl_conn_setattr+0x10c/0x280 net/netlabel/netlabel_kapi.c:1177 selinux_netlbl_socket_connect_helper+0xd3/0x1b0 security/selinux/netlabel.c:569 selinux_netlbl_socket_connect_locked security/selinux/netlabel.c:597 [inline] selinux_netlbl_socket_connect+0xb6/0x100 security/selinux/netlabel.c:615 selinux_socket_connect+0x5f/0x80 security/selinux/hooks.c:4931 security_socket_connect+0x50/0xa0 security/security.c:4598 __sys_connect_file+0xa4/0x190 net/socket.c:2067 __sys_connect+0x12c/0x170 net/socket.c:2088 __do_sys_connect net/socket.c:2098 [inline] __se_sys_connect net/socket.c:2095 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:2095 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xaa/0x1b0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f901b61a12d Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f9019bd6fa8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 00007f901b925fa0 RCX: 00007f901b61a12d RDX: 000000000000001c RSI: 0000200000000140 RDI: 0000000000000003 RBP: 00007f901b701505 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007f901b5b62a0 R15: 00007f9019bb7000 </TASK> Modules linked in: Fixes: ceba1832b1b2 ("calipso: Set the calipso socket label to match the secattr.") Reported-by: syzkaller <syzkaller@googlegroups.com> Reported-by: John Cheung <john.cs.hey@gmail.com> Closes: https://lore.kernel.org/netdev/CAP=Rh=M1LzunrcQB1fSGauMrJrhL6GGps5cPAKzHJXj6GQV+-g@mail.gmail.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://patch.msgid.link/20250522221858.91240-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28Merge branch ↵Paolo Abeni
'net_sched-hfsc-address-reentrant-enqueue-adding-class-to-eltree-twice' Pedro Tammela says: ==================== net_sched: hfsc: Address reentrant enqueue adding class to eltree twice Savino says: "We are writing to report that this recent patch (141d34391abbb315d68556b7c67ad97885407547) can be bypassed, and a UAF can still occur when HFSC is utilized with NETEM. The patch only checks the cl->cl_nactive field to determine whether it is the first insertion or not, but this field is only incremented by init_vf. By using HFSC_RSC (which uses init_ed), it is possible to bypass the check and insert the class twice in the eltree. Under normal conditions, this would lead to an infinite loop in hfsc_dequeue for the reasons we already explained in this report. However, if TBF is added as root qdisc and it is configured with a very low rate, it can be utilized to prevent packets from being dequeued. This behavior can be exploited to perform subsequent insertions in the HFSC eltree and cause a UAF." To fix both the UAF and the infinite loop, with netem as an hfsc child, check explicitly in hfsc_enqueue whether the class is already in the eltree whenever the HFSC_RSC flag is set. Also add a TDC test to reproduce the UAF scenario. ==================== Link: https://patch.msgid.link/20250522181448.1439717-1-pctammela@mojatatu.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28selftests/tc-testing: Add a test for HFSC eltree double add with reentrant ↵Pedro Tammela
enqueue behaviour on netem Reproduce the UAF scenario where netem is a child of HFSC and HFSC is configured to use the eltree. In such case, this TDC test would cause the HFSC class to be added to the eltree twice resulting in a UAF. Reviewed-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Link: https://patch.msgid.link/20250522181448.1439717-3-pctammela@mojatatu.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28net_sched: hfsc: Address reentrant enqueue adding class to eltree twicePedro Tammela
Savino says: "We are writing to report that this recent patch (141d34391abbb315d68556b7c67ad97885407547) [1] can be bypassed, and a UAF can still occur when HFSC is utilized with NETEM. The patch only checks the cl->cl_nactive field to determine whether it is the first insertion or not [2], but this field is only incremented by init_vf [3]. By using HFSC_RSC (which uses init_ed) [4], it is possible to bypass the check and insert the class twice in the eltree. Under normal conditions, this would lead to an infinite loop in hfsc_dequeue for the reasons we already explained in this report [5]. However, if TBF is added as root qdisc and it is configured with a very low rate, it can be utilized to prevent packets from being dequeued. This behavior can be exploited to perform subsequent insertions in the HFSC eltree and cause a UAF." To fix both the UAF and the infinite loop, with netem as an hfsc child, check explicitly in hfsc_enqueue whether the class is already in the eltree whenever the HFSC_RSC flag is set. [1] https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=141d34391abbb315d68556b7c67ad97885407547 [2] https://elixir.bootlin.com/linux/v6.15-rc5/source/net/sched/sch_hfsc.c#L1572 [3] https://elixir.bootlin.com/linux/v6.15-rc5/source/net/sched/sch_hfsc.c#L677 [4] https://elixir.bootlin.com/linux/v6.15-rc5/source/net/sched/sch_hfsc.c#L1574 [5] https://lore.kernel.org/netdev/8DuRWwfqjoRDLDmBMlIfbrsZg9Gx50DHJc1ilxsEBNe2D6NMoigR_eIRIG0LOjMc3r10nUUZtArXx4oZBIdUfZQrwjcQhdinnMis_0G7VEk=@willsroot.io/T/#u Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs") Reported-by: Savino Dicanosa <savy@syst3mfailure.io> Reported-by: William Liu <will@willsroot.io> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Link: https://patch.msgid.link/20250522181448.1439717-2-pctammela@mojatatu.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28octeontx2-pf: QOS: Refactor TC_HTB_LEAF_DEL_LAST callbackHariprasad Kelam
This patch addresses below issues, 1. Active traffic on the leaf node must be stopped before its send queue is reassigned to the parent. This patch resolves the issue by marking the node as 'Inner'. 2. During a system reboot, the interface receives TC_HTB_LEAF_DEL and TC_HTB_LEAF_DEL_LAST callbacks to delete its HTB queues. In the case of TC_HTB_LEAF_DEL_LAST, although the same send queue is reassigned to the parent, the current logic still attempts to update the real number of queues, leadning to below warnings New queues can't be registered after device unregistration. WARNING: CPU: 0 PID: 6475 at net/core/net-sysfs.c:1714 netdev_queue_update_kobjects+0x1e4/0x200 Fixes: 5e6808b4c68d ("octeontx2-pf: Add support for HTB offload") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250522115842.1499666-1-hkelam@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28octeontx2-pf: QOS: Perform cache sync on send queue teardownHariprasad Kelam
QOS is designed to create a new send queue whenever a class is created, ensuring proper shaping and scheduling. However, when multiple send queues are created and deleted in a loop, SMMU errors are observed. This patch addresses the issue by performing an data cache sync during the teardown of QOS send queues. Fixes: ab6dddd2a669 ("octeontx2-pf: qos send queues management") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250522094742.1498295-1-hkelam@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-28net: mana: Add support for Multi Vports on Bare metalHaiyang Zhang
To support Multi Vports on Bare metal, increase the device config response version. And, skip the register HW vport, and register filter steps, when the Bare metal hostmode is set. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://patch.msgid.link/1747671636-5810-1-git-send-email-haiyangz@microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-27Merge tag 'sched_ext-for-6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext updates from Tejun Heo: - More in-kernel idle CPU selection improvements. Expand topology awareness coverage add scx_bpf_select_cpu_and() to allow more flexibility. The idle CPU selection kfuncs can now be called from unlocked contexts too. - A bunch of reorganization changes to lay the foundation for multiple hierarchical scheduler support. This isn't ready yet and the included changes don't make meaningful behavior differences. One notable change is replacing some static_key tests with dynamic tests as the test results may differ depending on the scheduler instance. This isn't expected to cause meaningful performance difference. - Other minor and doc updates. - There were multiple patches in for-6.15-fixes which conflicted with changes in for-6.16. for-6.15-fixes were pulled three times into for-6.16 to resolve the conflicts. * tag 'sched_ext-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (49 commits) sched_ext: Call ops.update_idle() after updating builtin idle bits sched_ext, docs: convert mentions of "CFS" to "fair-class scheduler" selftests/sched_ext: Update test enq_select_cpu_fails sched_ext: idle: Consolidate default idle CPU selection kfuncs selftests/sched_ext: Add test for scx_bpf_select_cpu_and() via test_run sched_ext: idle: Allow scx_bpf_select_cpu_and() from unlocked context sched_ext: idle: Validate locking correctness in scx_bpf_select_cpu_and() sched_ext: Make scx_kf_allowed_if_unlocked() available outside ext.c sched_ext, docs: add label sched_ext: Explain the temporary situation around scx_root dereferences sched_ext: Add @sch to SCX_CALL_OP*() sched_ext: Cleanup [__]scx_exit/error*() sched_ext: Add @sch to SCX_CALL_OP*() sched_ext: Clean up scx_root usages Documentation: scheduler: Changed lowercase acronyms to uppercase sched_ext: Avoid NULL scx_root deref in __scx_exit() sched_ext: Add RCU protection to scx_root in DSQ iterator sched_ext: Clean up SCX_EXIT_NONE handling in scx_disable_workfn() sched_ext: Move disable machinery into scx_sched sched_ext: Move event_stats_cpu into scx_sched ...
2025-05-27Merge tag 'cgroup-for-6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - cgroup rstat shared the tracking tree across all controllers with the rationale being that a cgroup which is using one resource is likely to be using other resources at the same time (ie. if something is allocating memory, it's probably consuming CPU cycles). However, this turned out to not scale very well especially with memcg using rstat for internal operations which made memcg stat read and flush patterns substantially different from other controllers. JP Kobryn split the rstat tree per controller. - cgroup BPF support was hooking into cgroup init/exit paths directly. Convert them to use a notifier chain instead so that other usages can be added easily. The two of the patches which implement this are mislabeled as belonging to sched_ext instead of cgroup. Sorry. - Relatively minor cpuset updates - Documentation updates * tag 'cgroup-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (23 commits) sched_ext: Convert cgroup BPF support to use cgroup_lifetime_notifier sched_ext: Introduce cgroup_lifetime_notifier cgroup: Minor reorganization of cgroup_create() cgroup, docs: cpu controller's interaction with various scheduling policies cgroup, docs: convert space indentation to tab indentation cgroup: avoid per-cpu allocation of size zero rstat cpu locks cgroup, docs: be specific about bandwidth control of rt processes cgroup: document the rstat per-cpu initialization cgroup: helper for checking rstat participation of css cgroup: use subsystem-specific rstat locks to avoid contention cgroup: use separate rstat trees for each subsystem cgroup: compare css to cgroup::self in helper for distingushing css cgroup: warn on rstat usage by early init subsystems cgroup/cpuset: drop useless cpumask_empty() in compute_effective_exclusive_cpumask() cgroup/rstat: Improve cgroup_rstat_push_children() documentation cgroup: fix goto ordering in cgroup_init() cgroup: fix pointer check in css_rstat_init() cgroup/cpuset: Add warnings to catch inconsistency in exclusive CPUs cgroup/cpuset: Fix obsolete comment in cpuset_css_offline() cgroup/cpuset: Always use cpu_active_mask ...
2025-05-27Merge tag 'wq-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds
Pull workqueue updates from Tejun Heo: "Fix statistic update race condition and a couple documentation updates" * tag 'wq-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: fix typo in comment workqueue: Fix race condition in wq->stats incrementation workqueue: Better document teardown for delayed_work