summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-10-04drm/i915/dsb: Introduce intel_dsb_vblank_evade()Ville Syrjälä
Add a helper for performing vblank evasion on the DSB. DSB based plane updates will need this to guarantee all the double buffered arming registers will get programmed atomically within the same frame. With VRR we more or less have two vblanks to worry about: - vmax vblank start in case no push was sent - vmin vblank start in case a push was already sent during the vertical active. Only a concern for mailbox updates, which I suppose could happen if the legacy cursor updates take the non-fastpath without setting state->legacy_cursor_update to false. Since we don't know which case is relevant we'll just evade both. We must also make sure to evade both the delayed vblank (for pipe/plane registers) and the undelayed vblank (for transcoder registers and chained DSBs w/ DSB_WAIT_FOR_VBLANK). TODO: come up with a sensible usec number for the evasion... Reviewed-by: Animesh Manna <animesh.manna@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240930170415.23841-6-ville.syrjala@linux.intel.com
2024-10-04drm/i915/dsb: Enable programmable DSB interruptVille Syrjälä
The DSB can signal a programmable interrupt in response to a specific DSB command getting executed. Hook that up. For now we'll just use this to signal the completion of the commit via a vblank event. If, in the future, we'll need to do other things in response to DSB interrupts we may need to come up with some kind of fancier DSB interrupt framework where the caller can specify a custom handler... Reviewed-by: Animesh Manna <animesh.manna@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240930170415.23841-5-ville.syrjala@linux.intel.com
2024-10-04drm/i915/dsb: Generate the DSB buffer in commit_tail()Ville Syrjälä
Once we start using DSB for plane updates we'll need to defer generating the DSB buffer until the clear color has been read out. So we need to move at some of the DSB stuff into commit_tail(). That is perhaps a better place for it anyway as the ioctl thread can move on immediately without spending time building the DSB commands. We always have the MMIO fallback (in case the DSB buffer allocation fails), so there's no real reason to keep any of this in the synchronous part of the ioctl. Because the DSB LUT programming doesn't depend on the plane clear color we can still do that part before waiting for fences/etc. which should help paralleize things a bit more. The DSB plane programming will need to happen after those however as that depends on the clear color. Reviewed-by: Animesh Manna <animesh.manna@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240930170415.23841-4-ville.syrjala@linux.intel.com
2024-10-04drm/i915: Prepare clear color before wait_for_dependencies()Ville Syrjälä
Read out the clear color as soon as fences and the transient data flush have finished. There is no need to wait for all the display specific operations that might still be going on. This could parallelize things a bit more effectively. Reviewed-by: Animesh Manna <animesh.manna@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240930170415.23841-3-ville.syrjala@linux.intel.com
2024-10-04drm/i915/dsb: Avoid reads of the DSB buffer for indexed register writesVille Syrjälä
Reading from the DSB command buffer might be somewhat expensive on discrete GPUs because the buffer resides in GPU local memory. Avoid such reads in the indexed register write handling by tracking the previous instruction in intel_dsb. TODO: actually measure this Reviewed-by: Animesh Manna <animesh.manna@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240930170415.23841-2-ville.syrjala@linux.intel.com
2024-10-03tracing/hwlat: Fix a race during cpuhp processingWei Li
The cpuhp online/offline processing race also exists in percpu-mode hwlat tracer in theory, apply the fix too. That is: T1 | T2 [CPUHP_ONLINE] | cpu_device_down() hwlat_hotplug_workfn() | | cpus_write_lock() | takedown_cpu(1) | cpus_write_unlock() [CPUHP_OFFLINE] | cpus_read_lock() | start_kthread(1) | cpus_read_unlock() | Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20240924094515.3561410-5-liwei391@huawei.com Fixes: ba998f7d9531 ("trace/hwlat: Support hotplug operations") Signed-off-by: Wei Li <liwei391@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-03tracing/timerlat: Fix a race during cpuhp processingWei Li
There is another found exception that the "timerlat/1" thread was scheduled on CPU0, and lead to timer corruption finally: ``` ODEBUG: init active (active state 0) object: ffff888237c2e108 object type: hrtimer hint: timerlat_irq+0x0/0x220 WARNING: CPU: 0 PID: 426 at lib/debugobjects.c:518 debug_print_object+0x7d/0xb0 Modules linked in: CPU: 0 UID: 0 PID: 426 Comm: timerlat/1 Not tainted 6.11.0-rc7+ #45 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:debug_print_object+0x7d/0xb0 ... Call Trace: <TASK> ? __warn+0x7c/0x110 ? debug_print_object+0x7d/0xb0 ? report_bug+0xf1/0x1d0 ? prb_read_valid+0x17/0x20 ? handle_bug+0x3f/0x70 ? exc_invalid_op+0x13/0x60 ? asm_exc_invalid_op+0x16/0x20 ? debug_print_object+0x7d/0xb0 ? debug_print_object+0x7d/0xb0 ? __pfx_timerlat_irq+0x10/0x10 __debug_object_init+0x110/0x150 hrtimer_init+0x1d/0x60 timerlat_main+0xab/0x2d0 ? __pfx_timerlat_main+0x10/0x10 kthread+0xb7/0xe0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2d/0x40 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> ``` After tracing the scheduling event, it was discovered that the migration of the "timerlat/1" thread was performed during thread creation. Further analysis confirmed that it is because the CPU online processing for osnoise is implemented through workers, which is asynchronous with the offline processing. When the worker was scheduled to create a thread, the CPU may has already been removed from the cpu_online_mask during the offline process, resulting in the inability to select the right CPU: T1 | T2 [CPUHP_ONLINE] | cpu_device_down() osnoise_hotplug_workfn() | | cpus_write_lock() | takedown_cpu(1) | cpus_write_unlock() [CPUHP_OFFLINE] | cpus_read_lock() | start_kthread(1) | cpus_read_unlock() | To fix this, skip online processing if the CPU is already offline. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20240924094515.3561410-4-liwei391@huawei.com Fixes: c8895e271f79 ("trace/osnoise: Support hotplug operations") Signed-off-by: Wei Li <liwei391@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-03tracing/timerlat: Drop interface_lock in stop_kthread()Wei Li
stop_kthread() is the offline callback for "trace/osnoise:online", since commit 5bfbcd1ee57b ("tracing/timerlat: Add interface_lock around clearing of kthread in stop_kthread()"), the following ABBA deadlock scenario is introduced: T1 | T2 [BP] | T3 [AP] osnoise_hotplug_workfn() | work_for_cpu_fn() | cpuhp_thread_fun() | _cpu_down() | osnoise_cpu_die() mutex_lock(&interface_lock) | | stop_kthread() | cpus_write_lock() | mutex_lock(&interface_lock) cpus_read_lock() | cpuhp_kick_ap() | As the interface_lock here in just for protecting the "kthread" field of the osn_var, use xchg() instead to fix this issue. Also use for_each_online_cpu() back in stop_per_cpu_kthreads() as it can take cpu_read_lock() again. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20240924094515.3561410-3-liwei391@huawei.com Fixes: 5bfbcd1ee57b ("tracing/timerlat: Add interface_lock around clearing of kthread in stop_kthread()") Signed-off-by: Wei Li <liwei391@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-03tracing/timerlat: Fix duplicated kthread creation due to CPU online/offlineWei Li
osnoise_hotplug_workfn() is the asynchronous online callback for "trace/osnoise:online". It may be congested when a CPU goes online and offline repeatedly and is invoked for multiple times after a certain online. This will lead to kthread leak and timer corruption. Add a check in start_kthread() to prevent this situation. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20240924094515.3561410-2-liwei391@huawei.com Fixes: c8895e271f79 ("trace/osnoise: Support hotplug operations") Signed-off-by: Wei Li <liwei391@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-03x86/ftrace: Include <asm/ptrace.h>Sami Tolvanen
<asm/ftrace.h> uses struct pt_regs in several places. Include <asm/ptrace.h> to ensure it's visible. This is needed to make sure object files that only include <asm/asm-prototypes.h> compile. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/20240916221557.846853-2-samitolvanen@google.com Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-03rtla: Fix the help text in osnoise and timerlat top toolsEder Zulian
The help text in osnoise top and timerlat top had some minor errors and omissions. The -d option was missing the 's' (second) abbreviation and the error message for '-d' used '-D'. Cc: stable@vger.kernel.org Fixes: 1eceb2fc2ca54 ("rtla/osnoise: Add osnoise top mode") Fixes: a828cd18bc4ad ("rtla: Add timerlat tool and timelart top mode") Link: https://lore.kernel.org/20240813155831.384446-1-ezulian@redhat.com Suggested-by: Tomas Glozar <tglozar@redhat.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Eder Zulian <ezulian@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-03tools/rtla: Fix installation from out-of-tree buildBen Hutchings
rtla now supports out-of-tree builds, but installation fails as it still tries to install the rtla binary from the source tree. Use the existing macro $(RTLA) to refer to the binary. Link: https://lore.kernel.org/ZudubuoU_JHjPZ7w@decadent.org.uk Fixes: 01474dc706ca ("tools/rtla: Use tools/build makefiles to build rtla") Reviewed-by: Tomas Glozar <tglozar@redhat.com> Tested-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Ben Hutchings <benh@debian.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-03tracing: Fix trace_check_vprintf() when tp_printk is usedSteven Rostedt
When the tp_printk kernel command line is used, the trace events go directly to printk(). It is still checked via the trace_check_vprintf() function to make sure the pointers of the trace event are legit. The addition of reading buffers from previous boots required adding a delta between the addresses of the previous boot and the current boot so that the pointers in the old buffer can still be used. But this required adding a trace_array pointer to acquire the delta offsets. The tp_printk code does not provide a trace_array (tr) pointer, so when the offsets were examined, a NULL pointer dereference happened and the kernel crashed. If the trace_array does not exist, just default the delta offsets to zero, as that also means the trace event is not being read from a previous boot. Link: https://lore.kernel.org/all/Zv3z5UsG_jsO9_Tb@aschofie-mobl2.lan/ Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20241003104925.4e1b1fd9@gandalf.local.home Fixes: 07714b4bb3f98 ("tracing: Handle old buffer mappings for event strings and functions") Reported-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-03drm/bridge: it6505: Drop EDID cache on bridge power offPin-yen Lin
The bridge might miss the display change events when it's powered off. This happens when a user changes the external monitor when the system is suspended and the embedded controller doesn't not wake AP up. It's also observed that one DP-to-HDMI bridge doesn't work correctly when there is no EDID read after it is powered on. Drop the cache to force an EDID read after system resume to fix this. Fixes: 11feaef69d0c ("drm/bridge: it6505: Add caching for EDID") Signed-off-by: Pin-yen Lin <treapking@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240926092931.3870342-3-treapking@chromium.org
2024-10-03drm/bridge: anx7625: Drop EDID cache on bridge power offPin-yen Lin
The bridge might miss the display change events when it's powered off. This happens when a user changes the external monitor when the system is suspended and the embedded controller doesn't not wake AP up. It's also observed that one DP-to-HDMI bridge doesn't work correctly when there is no EDID read after it is powered on. Drop the cache to force an EDID read after system resume to fix this. Fixes: 8bdfc5dae4e3 ("drm/bridge: anx7625: Add anx7625 MIPI DSI/DPI to DP") Signed-off-by: Pin-yen Lin <treapking@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240926092931.3870342-2-treapking@chromium.org
2024-10-03drm/msm/a6xx: Enable preemption for tested a7xx targetsAntonino Maniscalco
Initialize with 4 rings to enable preemption. Add the "preemption_enabled" module parameter to override this. Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8450-HDK Signed-off-by: Antonino Maniscalco <antomani103@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/618029/ Signed-off-by: Rob Clark <robdclark@chromium.org>
2024-10-03drm/msm/a6xx: Add a flag to allow preemption to submitqueue_createAntonino Maniscalco
Some userspace changes are necessary so add a flag for userspace to advertise support for preemption when creating the submitqueue. When this flag is not set preemption will not be allowed in the middle of the submitted IBs therefore mantaining compatibility with older userspace. The flag is rejected if preemption is not supported on the target, this allows userspace to know whether preemption is supported. Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8450-HDK Signed-off-by: Antonino Maniscalco <antomani103@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/618028/ Signed-off-by: Rob Clark <robdclark@chromium.org>
2024-10-03drm/msm/a6xx: Add traces for preemptionAntonino Maniscalco
Add trace points corresponding to preemption being triggered and being completed for latency measurement purposes. Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8450-HDK Signed-off-by: Antonino Maniscalco <antomani103@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/618026/ Signed-off-by: Rob Clark <robdclark@chromium.org>
2024-10-03drm/msm/a6xx: Use posamble to reset counters on preemptionAntonino Maniscalco
Use the postamble to reset perf counters when switching between rings, except when sysprof is enabled, analogously to how they are reset between submissions when switching pagetables. Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8450-HDK Signed-off-by: Antonino Maniscalco <antomani103@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/618024/ Signed-off-by: Rob Clark <robdclark@chromium.org>
2024-10-03drm/msm/a6xx: Sync relevant adreno_pm4.xml changesAntonino Maniscalco
In mesa CP_SET_CTXSWITCH_IB is renamed to CP_SET_AMBLE and some other names are changed to match KGSL. Import those changes. The changes have not been merged yet in mesa but are necessary for this series. Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8450-HDK Signed-off-by: Antonino Maniscalco <antomani103@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/618023/ Signed-off-by: Rob Clark <robdclark@chromium.org>
2024-10-03drm/msm/a6xx: Implement preemption for a7xx targetsAntonino Maniscalco
This patch implements preemption feature for A6xx targets, this allows the GPU to switch to a higher priority ringbuffer if one is ready. A6XX hardware as such supports multiple levels of preemption granularities, ranging from coarse grained(ringbuffer level) to a more fine grained such as draw-call level or a bin boundary level preemption. This patch enables the basic preemption level, with more fine grained preemption support to follow. Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8450-HDK Signed-off-by: Sharat Masetty <smasetty@codeaurora.org> Signed-off-by: Antonino Maniscalco <antomani103@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/618021/ Signed-off-by: Rob Clark <robdclark@chromium.org>
2024-10-03drm/msm/a6xx: Add a pwrup_list field to a6xx_infoAntonino Maniscalco
Add a field to contain the pwup_reglist needed for preemption. Signed-off-by: Antonino Maniscalco <antomani103@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/618018/ Signed-off-by: Rob Clark <robdclark@chromium.org>
2024-10-03nfs_common: fix Kconfig for NFS_COMMON_LOCALIO_SUPPORTMike Snitzer
The 'default n' that was in NFS_COMMON_LOCALIO_SUPPORT caused these extra defaults to be missed: default y if NFSD=y || NFS_FS=y default m if NFSD=m && NFS_FS=m Remove the 'default n' for NFS_COMMON_LOCALIO_SUPPORT so that the correct tristate is selected based on how NFSD and NFS_FS are configured. This fixes the reported case where NFS_FS=y but NFS_COMMON_LOCALIO_SUPPORT=m, it is now correctly set to =y. In addition, add extra 'depends on NFS_LOCALIO' to NFS_COMMON_LOCALIO_SUPPORT so that if NFS_LOCALIO isn't set then NFS_COMMON_LOCALIO_SUPPORT will not be either. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410031944.hMCFY9BO-lkp@intel.com/ Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-10-03nfs_common: fix race in NFS calls to nfsd_file_put_local() and nfsd_serv_put()Mike Snitzer
Add nfs_to_nfsd_file_put_local() interface to fix race with nfsd module unload. Similarly, use RCU around nfs_open_local_fh()'s error path call to nfs_to->nfsd_serv_put(). Holding RCU ensures that NFS will safely _call and return_ from its nfs_to calls into the NFSD functions nfsd_file_put_local() and nfsd_serv_put(). Otherwise, if RCU isn't used then there is a narrow window when NFS's reference for the nfsd_file and nfsd_serv are dropped and the NFSD module could be unloaded, which could result in a crash from the return instruction for either nfs_to->nfsd_file_put_local() or nfs_to->nfsd_serv_put(). Reported-by: NeilBrown <neilb@suse.de> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-10-03NFSv4: Prevent NULL-pointer dereference in nfs42_complete_copies()Yanjun Zhang
On the node of an NFS client, some files saved in the mountpoint of the NFS server were copied to another location of the same NFS server. Accidentally, the nfs42_complete_copies() got a NULL-pointer dereference crash with the following syslog: [232064.838881] NFSv4: state recovery failed for open file nfs/pvc-12b5200d-cd0f-46a3-b9f0-af8f4fe0ef64.qcow2, error = -116 [232064.839360] NFSv4: state recovery failed for open file nfs/pvc-12b5200d-cd0f-46a3-b9f0-af8f4fe0ef64.qcow2, error = -116 [232066.588183] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058 [232066.588586] Mem abort info: [232066.588701] ESR = 0x0000000096000007 [232066.588862] EC = 0x25: DABT (current EL), IL = 32 bits [232066.589084] SET = 0, FnV = 0 [232066.589216] EA = 0, S1PTW = 0 [232066.589340] FSC = 0x07: level 3 translation fault [232066.589559] Data abort info: [232066.589683] ISV = 0, ISS = 0x00000007 [232066.589842] CM = 0, WnR = 0 [232066.589967] user pgtable: 64k pages, 48-bit VAs, pgdp=00002000956ff400 [232066.590231] [0000000000000058] pgd=08001100ae100003, p4d=08001100ae100003, pud=08001100ae100003, pmd=08001100b3c00003, pte=0000000000000000 [232066.590757] Internal error: Oops: 96000007 [#1] SMP [232066.590958] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm vhost_net vhost vhost_iotlb tap tun ipt_rpfilter xt_multiport ip_set_hash_ip ip_set_hash_net xfrm_interface xfrm6_tunnel tunnel4 tunnel6 esp4 ah4 wireguard libcurve25519_generic veth xt_addrtype xt_set nf_conntrack_netlink ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_bitmap_port ip_set_hash_ipport dummy ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs iptable_filter sch_ingress nfnetlink_cttimeout vport_gre ip_gre ip_tunnel gre vport_geneve geneve vport_vxlan vxlan ip6_udp_tunnel udp_tunnel openvswitch nf_conncount dm_round_robin dm_service_time dm_multipath xt_nat xt_MASQUERADE nft_chain_nat nf_nat xt_mark xt_conntrack xt_comment nft_compat nft_counter nf_tables nfnetlink ocfs2 ocfs2_nodemanager ocfs2_stackglue iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipmi_ssif nbd overlay 8021q garp mrp bonding tls rfkill sunrpc ext4 mbcache jbd2 [232066.591052] vfat fat cas_cache cas_disk ses enclosure scsi_transport_sas sg acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler ip_tables vfio_pci vfio_pci_core vfio_virqfd vfio_iommu_type1 vfio dm_mirror dm_region_hash dm_log dm_mod nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc fuse xfs libcrc32c ast drm_vram_helper qla2xxx drm_kms_helper syscopyarea crct10dif_ce sysfillrect ghash_ce sysimgblt sha2_ce fb_sys_fops cec sha256_arm64 sha1_ce drm_ttm_helper ttm nvme_fc igb sbsa_gwdt nvme_fabrics drm nvme_core i2c_algo_bit i40e scsi_transport_fc megaraid_sas aes_neon_bs [232066.596953] CPU: 6 PID: 4124696 Comm: 10.253.166.125- Kdump: loaded Not tainted 5.15.131-9.cl9_ocfs2.aarch64 #1 [232066.597356] Hardware name: Great Wall .\x93\x8e...RF6260 V5/GWMSSE2GL1T, BIOS T656FBE_V3.0.18 2024-01-06 [232066.597721] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [232066.598034] pc : nfs4_reclaim_open_state+0x220/0x800 [nfsv4] [232066.598327] lr : nfs4_reclaim_open_state+0x12c/0x800 [nfsv4] [232066.598595] sp : ffff8000f568fc70 [232066.598731] x29: ffff8000f568fc70 x28: 0000000000001000 x27: ffff21003db33000 [232066.599030] x26: ffff800005521ae0 x25: ffff0100f98fa3f0 x24: 0000000000000001 [232066.599319] x23: ffff800009920008 x22: ffff21003db33040 x21: ffff21003db33050 [232066.599628] x20: ffff410172fe9e40 x19: ffff410172fe9e00 x18: 0000000000000000 [232066.599914] x17: 0000000000000000 x16: 0000000000000004 x15: 0000000000000000 [232066.600195] x14: 0000000000000000 x13: ffff800008e685a8 x12: 00000000eac0c6e6 [232066.600498] x11: 0000000000000000 x10: 0000000000000008 x9 : ffff8000054e5828 [232066.600784] x8 : 00000000ffffffbf x7 : 0000000000000001 x6 : 000000000a9eb14a [232066.601062] x5 : 0000000000000000 x4 : ffff70ff8a14a800 x3 : 0000000000000058 [232066.601348] x2 : 0000000000000001 x1 : 54dce46366daa6c6 x0 : 0000000000000000 [232066.601636] Call trace: [232066.601749] nfs4_reclaim_open_state+0x220/0x800 [nfsv4] [232066.601998] nfs4_do_reclaim+0x1b8/0x28c [nfsv4] [232066.602218] nfs4_state_manager+0x928/0x10f0 [nfsv4] [232066.602455] nfs4_run_state_manager+0x78/0x1b0 [nfsv4] [232066.602690] kthread+0x110/0x114 [232066.602830] ret_from_fork+0x10/0x20 [232066.602985] Code: 1400000d f9403f20 f9402e61 91016003 (f9402c00) [232066.603284] SMP: stopping secondary CPUs [232066.606936] Starting crashdump kernel... [232066.607146] Bye! Analysing the vmcore, we know that nfs4_copy_state listed by destination nfs_server->ss_copies was added by the field copies in handle_async_copy(), and we found a waiting copy process with the stack as: PID: 3511963 TASK: ffff710028b47e00 CPU: 0 COMMAND: "cp" #0 [ffff8001116ef740] __switch_to at ffff8000081b92f4 #1 [ffff8001116ef760] __schedule at ffff800008dd0650 #2 [ffff8001116ef7c0] schedule at ffff800008dd0a00 #3 [ffff8001116ef7e0] schedule_timeout at ffff800008dd6aa0 #4 [ffff8001116ef860] __wait_for_common at ffff800008dd166c #5 [ffff8001116ef8e0] wait_for_completion_interruptible at ffff800008dd1898 #6 [ffff8001116ef8f0] handle_async_copy at ffff8000055142f4 [nfsv4] #7 [ffff8001116ef970] _nfs42_proc_copy at ffff8000055147c8 [nfsv4] #8 [ffff8001116efa80] nfs42_proc_copy at ffff800005514cf0 [nfsv4] #9 [ffff8001116efc50] __nfs4_copy_file_range.constprop.0 at ffff8000054ed694 [nfsv4] The NULL-pointer dereference was due to nfs42_complete_copies() listed the nfs_server->ss_copies by the field ss_copies of nfs4_copy_state. So the nfs4_copy_state address ffff0100f98fa3f0 was offset by 0x10 and the data accessed through this pointer was also incorrect. Generally, the ordered list nfs4_state_owner->so_states indicate open(O_RDWR) or open(O_WRITE) states are reclaimed firstly by nfs4_reclaim_open_state(). When destination state reclaim is failed with NFS_STATE_RECOVERY_FAILED and copies are not deleted in nfs_server->ss_copies, the source state may be passed to the nfs42_complete_copies() process earlier, resulting in this crash scene finally. To solve this issue, we add a list_head nfs_server->ss_src_copies for a server-to-server copy specially. Fixes: 0e65a32c8a56 ("NFS: handle source server reboot") Signed-off-by: Yanjun Zhang <zhangyanjun@cestc.cn> Reviewed-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-10-03SUNRPC: Fix integer overflow in decode_rc_list()Dan Carpenter
The math in "rc_list->rcl_nrefcalls * 2 * sizeof(uint32_t)" could have an integer overflow. Add bounds checking on rc_list->rcl_nrefcalls to fix that. Fixes: 4aece6a19cf7 ("nfs41: cb_sequence xdr implementation") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-10-03drm/msm: Add CONTEXT_SWITCH_CNTL bitfieldsAntonino Maniscalco
Add missing bitfields to CONTEXT_SWITCH_CNTL in a6xx.xml. Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8450-HDK Signed-off-by: Antonino Maniscalco <antomani103@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/618016/ Signed-off-by: Rob Clark <robdclark@chromium.org>
2024-10-03drm/msm: Add a `preempt_record_size` fieldAntonino Maniscalco
Adds a field to `adreno_info` to store the GPU specific preempt record size. Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8450-HDK Signed-off-by: Antonino Maniscalco <antomani103@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/618015/ Signed-off-by: Rob Clark <robdclark@chromium.org>
2024-10-03drm/msm/a6xx: Track current_ctx_seqno per ringAntonino Maniscalco
With preemption it is not enough to track the current_ctx_seqno globally as execution might switch between rings. This is especially problematic when current_ctx_seqno is used to determine whether a page table switch is necessary as it might lead to security bugs. Track current context per ring. Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8450-HDK Signed-off-by: Antonino Maniscalco <antomani103@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/618012/ Signed-off-by: Rob Clark <robdclark@chromium.org>
2024-10-03drm/msm: Fix bv_fence being used as bv_rptrAntonino Maniscalco
The bv_fence field of rbmemptrs was being used incorrectly as the BV rptr shadow pointer in some places. Add a bv_rptr field and change the code to use that instead. Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Tested-by: Rob Clark <robdclark@gmail.com> Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8650-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8450-HDK Signed-off-by: Antonino Maniscalco <antomani103@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/618010/ Signed-off-by: Rob Clark <robdclark@chromium.org>
2024-10-03nvme: tcp: avoid race between queue_lock lock and destroyHannes Reinecke
Commit 76d54bf20cdc ("nvme-tcp: don't access released socket during error recovery") added a mutex_lock() call for the queue->queue_lock in nvme_tcp_get_address(). However, the mutex_lock() races with mutex_destroy() in nvme_tcp_free_queue(), and causes the WARN below. DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: CPU: 3 PID: 34077 at kernel/locking/mutex.c:587 __mutex_lock+0xcf0/0x1220 Modules linked in: nvmet_tcp nvmet nvme_tcp nvme_fabrics iw_cm ib_cm ib_core pktcdvd nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables qrtr sunrpc ppdev 9pnet_virtio 9pnet pcspkr netfs parport_pc parport e1000 i2c_piix4 i2c_smbus loop fuse nfnetlink zram bochs drm_vram_helper drm_ttm_helper ttm drm_kms_helper xfs drm sym53c8xx floppy nvme scsi_transport_spi nvme_core nvme_auth serio_raw ata_generic pata_acpi dm_multipath qemu_fw_cfg [last unloaded: ib_uverbs] CPU: 3 UID: 0 PID: 34077 Comm: udisksd Not tainted 6.11.0-rc7 #319 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:__mutex_lock+0xcf0/0x1220 Code: 08 84 d2 0f 85 c8 04 00 00 8b 15 ef b6 c8 01 85 d2 0f 85 78 f4 ff ff 48 c7 c6 20 93 ee af 48 c7 c7 60 91 ee af e8 f0 a7 6d fd <0f> 0b e9 5e f4 ff ff 48 b8 00 00 00 00 00 fc ff df 4c 89 f2 48 c1 RSP: 0018:ffff88811305f760 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff88812c652058 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000001 RBP: ffff88811305f8b0 R08: 0000000000000001 R09: ffffed1075c36341 R10: ffff8883ae1b1a0b R11: 0000000000010498 R12: 0000000000000000 R13: 0000000000000000 R14: dffffc0000000000 R15: ffff88812c652058 FS: 00007f9713ae4980(0000) GS:ffff8883ae180000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fcd78483c7c CR3: 0000000122c38000 CR4: 00000000000006f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __warn.cold+0x5b/0x1af ? __mutex_lock+0xcf0/0x1220 ? report_bug+0x1ec/0x390 ? handle_bug+0x3c/0x80 ? exc_invalid_op+0x13/0x40 ? asm_exc_invalid_op+0x16/0x20 ? __mutex_lock+0xcf0/0x1220 ? nvme_tcp_get_address+0xc2/0x1e0 [nvme_tcp] ? __pfx___mutex_lock+0x10/0x10 ? __lock_acquire+0xd6a/0x59e0 ? nvme_tcp_get_address+0xc2/0x1e0 [nvme_tcp] nvme_tcp_get_address+0xc2/0x1e0 [nvme_tcp] ? __pfx_nvme_tcp_get_address+0x10/0x10 [nvme_tcp] nvme_sysfs_show_address+0x81/0xc0 [nvme_core] dev_attr_show+0x42/0x80 ? __asan_memset+0x1f/0x40 sysfs_kf_seq_show+0x1f0/0x370 seq_read_iter+0x2cb/0x1130 ? rw_verify_area+0x3b1/0x590 ? __mutex_lock+0x433/0x1220 vfs_read+0x6a6/0xa20 ? lockdep_hardirqs_on+0x78/0x100 ? __pfx_vfs_read+0x10/0x10 ksys_read+0xf7/0x1d0 ? __pfx_ksys_read+0x10/0x10 ? __x64_sys_openat+0x105/0x1d0 do_syscall_64+0x93/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? __pfx_ksys_read+0x10/0x10 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? do_syscall_64+0x9f/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f9713f55cfa Code: 55 48 89 e5 48 83 ec 20 48 89 55 e8 48 89 75 f0 89 7d f8 e8 e8 74 f8 ff 48 8b 55 e8 48 8b 75 f0 41 89 c0 8b 7d f8 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 2e 44 89 c7 48 89 45 f8 e8 42 75 f8 ff 48 8b RSP: 002b:00007ffd7f512e70 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 000055c38f316859 RCX: 00007f9713f55cfa RDX: 0000000000000fff RSI: 00007ffd7f512eb0 RDI: 0000000000000011 RBP: 00007ffd7f512e90 R08: 0000000000000000 R09: 00000000ffffffff R10: 0000000000000000 R11: 0000000000000246 R12: 000055c38f317148 R13: 0000000000000000 R14: 00007f96f4004f30 R15: 000055c3b6b623c0 </TASK> The WARN is observed when the blktests test case nvme/014 is repeated with tcp transport. It is rare, and 200 times repeat is required to recreate in some test environments. To avoid the WARN, check the NVME_TCP_Q_LIVE flag before locking queue->queue_lock. The flag is cleared long time before the lock gets destroyed. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-10-03gpiolib: Fix potential NULL pointer dereference in gpiod_get_label()Lad Prabhakar
In `gpiod_get_label()`, it is possible that `srcu_dereference_check()` may return a NULL pointer, leading to a scenario where `label->str` is accessed without verifying if `label` itself is NULL. This patch adds a proper NULL check for `label` before accessing `label->str`. The check for `label->str != NULL` is removed because `label->str` can never be NULL if `label` is not NULL. This fixes the issue where the label name was being printed as `(efault)` when dumping the sysfs GPIO file when `label == NULL`. Fixes: 5a646e03e956 ("gpiolib: Return label, if set, for IRQ only line") Fixes: a86d27693066 ("gpiolib: fix the speed of descriptor label setting with SRCU") Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Link: https://lore.kernel.org/r/20241003131351.472015-1-prabhakar.mahadev-lad.rj@bp.renesas.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-10-03KVM: arm64: Fix kvm_has_feat*() handling of negative featuresMarc Zyngier
Oliver reports that the kvm_has_feat() helper is not behaviing as expected for negative feature. On investigation, the main issue seems to be caused by the following construct: #define get_idreg_field(kvm, id, fld) \ (id##_##fld##_SIGNED ? \ get_idreg_field_signed(kvm, id, fld) : \ get_idreg_field_unsigned(kvm, id, fld)) where one side of the expression evaluates as something signed, and the other as something unsigned. In retrospect, this is totally braindead, as the compiler converts this into an unsigned expression. When compared to something that is 0, the test is simply elided. Epic fail. Similar issue exists in the expand_field_sign() macro. The correct way to handle this is to chose between signed and unsigned comparisons, so that both sides of the ternary expression are of the same type (bool). In order to keep the code readable (sort of), we introduce new comparison primitives taking an operator as a parameter, and rewrite the kvm_has_feat*() helpers in terms of these primitives. Fixes: c62d7a23b947 ("KVM: arm64: Add feature checking helpers") Reported-by: Oliver Upton <oliver.upton@linux.dev> Tested-by: Oliver Upton <oliver.upton@linux.dev> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241002204239.2051637-1-maz@kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org>
2024-10-03drm/xe: Restore GT freq on GSC load errorVinay Belgaumkar
As part of a Wa_22019338487, ensure that GT freq is restored even when GSC reload is not successful. Fixes: 3b1592fb7835 ("drm/xe/lnl: Apply Wa_22019338487") Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240925204918.1989574-1-vinay.belgaumkar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-10-03drm/nouveau/i2c: rename aux.c and aux.h to auxch.c and auxch.hBenjamin Szőke
The goal is to clean-up Linux repository from AUX file names, because the use of such file names is prohibited on other operating systems such as Windows, so the Linux repository cannot be cloned and edited on them. Signed-off-by: Benjamin Szőke <egyszeregy@freemail.hu> Reviewed-by: Ben Skeggs <bskeggs@nvidia.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240603091558.35672-1-egyszeregy@freemail.hu
2024-10-03cifs: Do not convert delimiter when parsing NFS-style symlinksPali Rohár
NFS-style symlinks have target location always stored in NFS/UNIX form where backslash means the real UNIX backslash and not the SMB path separator. So do not mangle slash and backslash content of NFS-style symlink during readlink() syscall as it is already in the correct Linux form. This fixes interoperability of NFS-style symlinks with backslashes created by Linux NFS3 client throw Windows NFS server and retrieved by Linux SMB client throw Windows SMB server, where both Windows servers exports the same directory. Fixes: d5ecebc4900d ("smb3: Allow query of symlinks stored as reparse points") Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-10-03cifs: Validate content of NFS reparse point bufferPali Rohár
Symlink target location stored in DataBuffer is encoded in UTF-16. So check that symlink DataBuffer length is non-zero and even number. And check that DataBuffer does not contain UTF-16 null codepoint because Linux cannot process symlink with null byte. DataBuffer for char and block devices is 8 bytes long as it contains two 32-bit numbers (major and minor). Add check for this. DataBuffer buffer for sockets and fifos zero-length. Add checks for this. Signed-off-by: Pali Rohár <pali@kernel.org> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-10-03cifs: Fix buffer overflow when parsing NFS reparse pointsPali Rohár
ReparseDataLength is sum of the InodeType size and DataBuffer size. So to get DataBuffer size it is needed to subtract InodeType's size from ReparseDataLength. Function cifs_strndup_from_utf16() is currentlly accessing buf->DataBuffer at position after the end of the buffer because it does not subtract InodeType size from the length. Fix this problem and correctly subtract variable len. Member InodeType is present only when reparse buffer is large enough. Check for ReparseDataLength before accessing InodeType to prevent another invalid memory access. Major and minor rdev values are present also only when reparse buffer is large enough. Check for reparse buffer size before calling reparse_mkdev(). Fixes: d5ecebc4900d ("smb3: Allow query of symlinks stored as reparse points") Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-10-03Merge tag 'net-6.12-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from ieee802154, bluetooth and netfilter. Current release - regressions: - eth: mlx5: fix wrong reserved field in hca_cap_2 in mlx5_ifc - eth: am65-cpsw: fix forever loop in cleanup code Current release - new code bugs: - eth: mlx5: HWS, fixed double-free in error flow of creating SQ Previous releases - regressions: - core: avoid potential underflow in qdisc_pkt_len_init() with UFO - core: test for not too small csum_start in virtio_net_hdr_to_skb() - vrf: revert "vrf: remove unnecessary RCU-bh critical section" - bluetooth: - fix uaf in l2cap_connect - fix possible crash on mgmt_index_removed - dsa: improve shutdown sequence - eth: mlx5e: SHAMPO, fix overflow of hd_per_wq - eth: ip_gre: fix drops of small packets in ipgre_xmit Previous releases - always broken: - core: fix gso_features_check to check for both dev->gso_{ipv4_,}max_size - core: fix tcp fraglist segmentation after pull from frag_list - netfilter: nf_tables: prevent nf_skb_duplicated corruption - sctp: set sk_state back to CLOSED if autobind fails in sctp_listen_start - mac802154: fix potential RCU dereference issue in mac802154_scan_worker - eth: fec: restart PPS after link state change" * tag 'net-6.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (48 commits) sctp: set sk_state back to CLOSED if autobind fails in sctp_listen_start dt-bindings: net: xlnx,axi-ethernet: Add missing reg minItems doc: net: napi: Update documentation for napi_schedule_irqoff net/ncsi: Disable the ncsi work before freeing the associated structure net: phy: qt2025: Fix warning: unused import DeviceId gso: fix udp gso fraglist segmentation after pull from frag_list bridge: mcast: Fail MDB get request on empty entry vrf: revert "vrf: Remove unnecessary RCU-bh critical section" net: ethernet: ti: am65-cpsw: Fix forever loop in cleanup code net: phy: realtek: Check the index value in led_hw_control_get ppp: do not assume bh is held in ppp_channel_bridge_input() selftests: rds: move include.sh to TEST_FILES net: test for not too small csum_start in virtio_net_hdr_to_skb() net: gso: fix tcp fraglist segmentation after pull from frag_list ipv4: ip_gre: Fix drops of small packets in ipgre_xmit net: stmmac: dwmac4: extend timeout for VLAN Tag register busy bit check net: add more sanity checks to qdisc_pkt_len_init() net: avoid potential underflow in qdisc_pkt_len_init() with UFO net: ethernet: ti: cpsw_ale: Fix warning on some platforms net: microchip: Make FDMA config symbol invisible ...
2024-10-03Merge tag 'v6.12-rc1-ksmbd-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull smb server fixes from Steve French: - small cleanup patches leveraging struct size to improve access bounds checking * tag 'v6.12-rc1-ksmbd-fixes' of git://git.samba.org/ksmbd: ksmbd: Use struct_size() to improve smb_direct_rdma_xmit() ksmbd: Annotate struct copychunk_ioctl_req with __counted_by_le() ksmbd: Use struct_size() to improve get_file_alternate_info()
2024-10-03Merge tag 'vfs-6.12-rc2.fixes.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "vfs: - Ensure that iter_folioq_get_pages() advances to the next slot otherwise it will end up using the same folio with an out-of-bound offset. iomap: - Dont unshare delalloc extents which can't be reflinked, and thus can't be shared. - Constrain the file range passed to iomap_file_unshare() directly in iomap instead of requiring the callers to do it. netfs: - Use folioq_count instead of folioq_nr_slot to prevent an unitialized value warning in netfs_clear_buffer(). - Fix missing wakeup after issuing writes by scheduling the write collector only if all the subrequest queues are empty and thus no writes are pending. - Fix two minor documentation bugs" * tag 'vfs-6.12-rc2.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: iomap: constrain the file range passed to iomap_file_unshare iomap: don't bother unsharing delalloc extents netfs: Fix missing wakeup after issuing writes Documentation: add missing folio_queue entry folio_queue: fix documentation netfs: Fix a KMSAN uninit-value error in netfs_clear_buffer iov_iter: fix advancing slot in iter_folioq_get_pages()
2024-10-03docs/gpu: ci: update flake tests requirementsVignesh Raman
Update the documentation to specify linking to a relevant GitLab issue or email report for each new flake entry. Added specific GitLab issue urls for amdgpu, i915, msm and xe driver. Acked-by: Maxime Ripard <mripard@kernel.org> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> #intel and xe Acked-by: Abhinav Kumar <quic_abhinavk@quicinc.com> # msm Acked-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # msm Signed-off-by: Vignesh Raman <vignesh.raman@collabora.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Helen Koike <helen.koike@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240930095255.2071586-1-vignesh.raman@collabora.com
2024-10-03Merge tag 'intel-pinctrl-v6.12-2' of ↵Linus Walleij
git://git.kernel.org/pub/scm/linux/kernel/git/pinctrl/intel into fixes intel-pinctrl for v6.12-2 Fixes a few issues with Intel pin control platform driver: * fix missing reference counter drop of fwnode on error path * replace comma by semicolon to follow the kernel style * add Panther Lake to the list of supported devices The following is an automated git shortlog grouped by driver: intel: - platform: Add Panther Lake to the list of supported - platform: use semicolon instead of comma in ncommunities assignment - platform: fix error path in device_for_each_child_node() Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2024-10-03drm/xe: Use fault injection infrastructure to find issues at probe timeFrancois Dugast
The kernel fault injection infrastructure is used to test proper error handling during probe. The return code of the functions using ALLOW_ERROR_INJECTION() can be conditionnally modified at runtime by tuning some debugfs entries. This requires CONFIG_FUNCTION_ERROR_INJECTION (among others). One way to use fault injection at probe time by making each of those functions fail one at a time is: FAILTYPE=fail_function DEVICE="0000:00:08.0" # depends on the system ERRNO=-12 # -ENOMEM, can depend on the function echo N > /sys/kernel/debug/$FAILTYPE/task-filter echo 100 > /sys/kernel/debug/$FAILTYPE/probability echo 0 > /sys/kernel/debug/$FAILTYPE/interval echo -1 > /sys/kernel/debug/$FAILTYPE/times echo 0 > /sys/kernel/debug/$FAILTYPE/space echo 1 > /sys/kernel/debug/$FAILTYPE/verbose modprobe xe echo $DEVICE > /sys/bus/pci/drivers/xe/unbind grep -oP "^.* \[xe\]" /sys/kernel/debug/$FAILTYPE/injectable | \ cut -d ' ' -f 1 | while read -r FUNCTION ; do echo "Injecting fault in $FUNCTION" echo "" > /sys/kernel/debug/$FAILTYPE/inject echo $FUNCTION > /sys/kernel/debug/$FAILTYPE/inject printf %#x $ERRNO > /sys/kernel/debug/$FAILTYPE/$FUNCTION/retval echo $DEVICE > /sys/bus/pci/drivers/xe/bind done rmmod xe It will also be integrated into IGT for systematic execution by CI. v2: Wrappers are not needed in the cases covered by this patch, so remove them and use ALLOW_ERROR_INJECTION() directly. v3: Document the use of fault injection at probe time in xe_pci_probe and refer to it where ALLOW_ERROR_INJECTION() is used. Signed-off-by: Francois Dugast <francois.dugast@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240927151207.399354-1-francois.dugast@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-10-03drm/i915/irq: remove GEN8_IRQ_RESET_NDX() and GEN8_IRQ_INIT_NDX() macrosJani Nikula
Define register offset triplets for all registers used with GEN8_IRQ_RESET_NDX() and GEN8_IRQ_INIT_NDX() macros, and call the underlying gen3_irq_reset() and gen3_irq_init() functions directly. Remove the macros, along with the macro name concatenation hackery. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241002102645.136155-3-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-10-03drm/i915/irq: remove GEN3_IRQ_RESET() and GEN3_IRQ_INIT() macrosJani Nikula
Define register offset triplets for all registers used with GEN3_IRQ_RESET() and GEN3_IRQ_INIT() macros, and call the underlying gen3_irq_reset() and gen3_irq_init() functions directly. Remove the macros, along with the macro name concatenation hackery. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241002102645.136155-2-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-10-03drm/i915/irq: add struct i915_irq_regs tripletJani Nikula
Add struct i915_irq_regs to hold IMR/IER/IIR register offsets to pass to gen3_irq_reset() and gen3_irq_init(). This helps in grouping the registers and further cleanup. Note: gen3_irq_reset() and gen3_irq_init() really did have the IMR/IER/IIR parameters in different order. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241002102645.136155-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2024-10-03rust: device: change the from_raw() functionGuilherme Giacomo Simoes
The function Device::from_raw() increments a refcount by a call to bindings::get_device(ptr). This can be confused because usually from_raw() functions don't increment a refcount. Hence, rename Device::from_raw() to avoid confuion with other "from_raw" semantics. The new name of function should be "get_device" to be consistent with the function get_device() already exist in .c files. This function body also changed, because the `into()` will convert the `&'a Device` into `ARef<Device>` and also call `inc_ref` from the `AlwaysRefCounted` trait implemented for Device. Signed-off-by: Guilherme Giacomo Simoes <trintaeoitogc@gmail.com> Acked-by: Danilo Krummrich <dakr@kernel.org> Closes: https://github.com/Rust-for-Linux/linux/issues/1088 Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/20241001205603.106278-1-trintaeoitogc@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-03drm/i915/dp: Extract intel_edp_set_sink_rates()Ville Syrjälä
Declutter intel_edp_init_dpcd() a bit by extracting the sink rates probing into its own function. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240918190441.29071-3-ville.syrjala@linux.intel.com Reviewed-by: Luca Coelho <luciano.coelho@intel.com>
2024-10-03drm/i915/dp: Make intel_dp_get_colorimetry_status() staticVille Syrjälä
intel_dp_get_colorimetry_status() is not used outside of intel_dp.c. Make it static. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240918190441.29071-2-ville.syrjala@linux.intel.com Reviewed-by: Luca Coelho <luciano.coelho@intel.com>