summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-04-24drm/xe/guc: Fix arguments passed to relay G2H handlersMichal Wajdeczko
By default CT code was passing just payload of the G2H event message, while Relay code expects full G2H message including HXG header which contains DATA0 field. Fix that. Fixes: 26d4481ac23f ("drm/xe/guc: Start handling GuC Relay event messages") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240419150351.358-1-michal.wajdeczko@intel.com (cherry picked from commit 48c64d495fbef343c59598a793d583dfd199d389) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-04-24drm/xe: call free_gsc_pkt only once on action add failureHimal Prasad Ghimiray
The drmm_add_action_or_reset function automatically invokes the action (free_gsc_pkt) in the event of a failure; therefore, there's no necessity to call it within the return check. -v2 Fix commit message. (Lucas) Fixes: d8b1571312b7 ("drm/xe/huc: HuC authentication via GSC") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240412181211.1155732-4-himal.prasad.ghimiray@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 22bf0bc04d273ca002a47de55693797b13076602) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-04-24drm/xe: Remove sysfs only once on action add failureHimal Prasad Ghimiray
The drmm_add_action_or_reset function automatically invokes the action (sysfs removal) in the event of a failure; therefore, there's no necessity to call it within the return check. Modify the return type of xe_gt_ccs_mode_sysfs_init to int, allowing the caller to pass errors up the call chain. Should sysfs creation or drmm_add_action_or_reset fail, error propagation will prompt a driver load abort. -v2 Edit commit message (Nikula/Lucas) use err_force_wake label instead of new. (Lucas) Avoid unnecessary warn/error messages. (Lucas) Fixes: f3bc5bb4d53d ("drm/xe: Allow userspace to configure CCS mode") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240412181211.1155732-3-himal.prasad.ghimiray@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit a99641e38704202ae2a97202b3d249208c9cda7f) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-04-24x86/tdx: Preserve shared bit on mprotect()Kirill A. Shutemov
The TDX guest platform takes one bit from the physical address to indicate if the page is shared (accessible by VMM). This bit is not part of the physical_mask and is not preserved during mprotect(). As a result, the 'shared' bit is lost during mprotect() on shared mappings. _COMMON_PAGE_CHG_MASK specifies which PTE bits need to be preserved during modification. AMD includes 'sme_me_mask' in the define to preserve the 'encrypt' bit. To cover both Intel and AMD cases, include 'cc_mask' in _COMMON_PAGE_CHG_MASK instead of 'sme_me_mask'. Reported-and-tested-by: Chris Oo <cho@microsoft.com> Fixes: 41394e33f3a0 ("x86/tdx: Extend the confidential computing API to support TDX guests") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20240424082035.4092071-1-kirill.shutemov%40linux.intel.com
2024-04-24gpio: tegra186: Fix tegra186_gpio_is_accessible() checkPrathamesh Shete
The controller has several register bits describing access control information for a given GPIO pin. When SCR_SEC_[R|W]EN is unset, it means we have full read/write access to all the registers for given GPIO pin. When SCR_SEC[R|W]EN is set, it means we need to further check the accompanying SCR_SEC_G1[R|W] bit to determine read/write access to all the registers for given GPIO pin. This check was previously declaring that a GPIO pin was accessible only if either of the following conditions were met: - SCR_SEC_REN + SCR_SEC_WEN both set or - SCR_SEC_REN + SCR_SEC_WEN both set and SCR_SEC_G1R + SCR_SEC_G1W both set Update the check to properly handle cases where only one of SCR_SEC_REN or SCR_SEC_WEN is set. Fixes: b2b56a163230 ("gpio: tegra186: Check GPIO pin permission before access.") Signed-off-by: Prathamesh Shete <pshete@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20240424095514.24397-1-pshete@nvidia.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-04-24fbdev: fix incorrect address computation in deferred IONam Cao
With deferred IO enabled, a page fault happens when data is written to the framebuffer device. Then driver determines which page is being updated by calculating the offset of the written virtual address within the virtual memory area, and uses this offset to get the updated page within the internal buffer. This page is later copied to hardware (thus the name "deferred IO"). This offset calculation is only correct if the virtual memory area is mapped to the beginning of the internal buffer. Otherwise this is wrong. For example, if users do: mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000); Then the virtual memory area will mapped at offset 0xff000 within the internal buffer. This offset 0xff000 is not accounted for, and wrong page is updated. Correct the calculation by using vmf->pgoff instead. With this change, the variable "offset" will no longer hold the exact offset value, but it is rounded down to multiples of PAGE_SIZE. But this is still correct, because this variable is only used to calculate the page offset. Reported-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Closes: https://lore.kernel.org/linux-fbdev/271372d6-e665-4e7f-b088-dee5f4ab341a@oracle.com Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct") Cc: <stable@vger.kernel.org> Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Tested-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240423115053.4490-1-namcao@linutronix.de
2024-04-24x86/cpu: Fix check for RDPKRU in __show_regs()David Kaplan
cpu_feature_enabled(X86_FEATURE_OSPKE) does not necessarily reflect whether CR4.PKE is set on the CPU. In particular, they may differ on non-BSP CPUs before setup_pku() is executed. In this scenario, RDPKRU will #UD causing the system to hang. Fix by checking CR4 for PKE enablement which is always correct for the current CPU. The scenario happens by inserting a WARN* before setup_pku() in identiy_cpu() or some other diagnostic which would lead to calling __show_regs(). [ bp: Massage commit message. ] Signed-off-by: David Kaplan <david.kaplan@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240421191728.32239-1-bp@kernel.org
2024-04-24x86/CPU/AMD: Add models 0x10-0x1f to the Zen5 rangeWenkuan Wang
Add some more Zen5 models. Fixes: 3e4147f33f8b ("x86/CPU/AMD: Add X86_FEATURE_ZEN5") Signed-off-by: Wenkuan Wang <Wenkuan.Wang@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20240423144111.1362-1-bp@kernel.org
2024-04-24net: phy: mediatek-ge-soc: follow netdev LED trigger semanticsDaniel Golle
Only blink if the link is up on a LED which is programmed to also indicate link-status. Otherwise, if both LEDs are in use to indicate different speeds, the resulting blinking being inverted on LEDs which aren't switched on at a specific speed is quite counter-intuitive. Also make sure that state left behind by reset or the bootloader is recognized correctly including the half-duplex and full-duplex bits as well as the (unsupported by Linux netdev trigger semantics) link-down bit. Fixes: c66937b0f8db ("net: phy: mediatek-ge-soc: support PHY LEDs") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-24net: gtp: Fix Use-After-Free in gtp_dellinkHyunwoo Kim
Since call_rcu, which is called in the hlist_for_each_entry_rcu traversal of gtp_dellink, is not part of the RCU read critical section, it is possible that the RCU grace period will pass during the traversal and the key will be free. To prevent this, it should be changed to hlist_for_each_entry_safe. Fixes: 94dc550a5062 ("gtp: fix an use-after-free in ipv4_pdp_find()") Signed-off-by: Hyunwoo Kim <v4bel@theori.io> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-24smb: client: Fix struct_group() usage in __packed structsGustavo A. R. Silva
Use struct_group_attr() in __packed structs, instead of struct_group(). Below you can see the pahole output before/after changes: pahole -C smb2_file_network_open_info fs/smb/client/smb2ops.o struct smb2_file_network_open_info { union { struct { __le64 CreationTime; /* 0 8 */ __le64 LastAccessTime; /* 8 8 */ __le64 LastWriteTime; /* 16 8 */ __le64 ChangeTime; /* 24 8 */ __le64 AllocationSize; /* 32 8 */ __le64 EndOfFile; /* 40 8 */ __le32 Attributes; /* 48 4 */ }; /* 0 56 */ struct { __le64 CreationTime; /* 0 8 */ __le64 LastAccessTime; /* 8 8 */ __le64 LastWriteTime; /* 16 8 */ __le64 ChangeTime; /* 24 8 */ __le64 AllocationSize; /* 32 8 */ __le64 EndOfFile; /* 40 8 */ __le32 Attributes; /* 48 4 */ } network_open_info; /* 0 56 */ }; /* 0 56 */ __le32 Reserved; /* 56 4 */ /* size: 60, cachelines: 1, members: 2 */ /* last cacheline: 60 bytes */ } __attribute__((__packed__)); pahole -C smb2_file_network_open_info fs/smb/client/smb2ops.o struct smb2_file_network_open_info { union { struct { __le64 CreationTime; /* 0 8 */ __le64 LastAccessTime; /* 8 8 */ __le64 LastWriteTime; /* 16 8 */ __le64 ChangeTime; /* 24 8 */ __le64 AllocationSize; /* 32 8 */ __le64 EndOfFile; /* 40 8 */ __le32 Attributes; /* 48 4 */ } __attribute__((__packed__)); /* 0 52 */ struct { __le64 CreationTime; /* 0 8 */ __le64 LastAccessTime; /* 8 8 */ __le64 LastWriteTime; /* 16 8 */ __le64 ChangeTime; /* 24 8 */ __le64 AllocationSize; /* 32 8 */ __le64 EndOfFile; /* 40 8 */ __le32 Attributes; /* 48 4 */ } __attribute__((__packed__)) network_open_info; /* 0 52 */ }; /* 0 52 */ __le32 Reserved; /* 52 4 */ /* size: 56, cachelines: 1, members: 2 */ /* last cacheline: 56 bytes */ }; pahole -C smb_com_open_rsp fs/smb/client/cifssmb.o struct smb_com_open_rsp { ... union { struct { __le64 CreationTime; /* 48 8 */ __le64 LastAccessTime; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ __le64 LastWriteTime; /* 64 8 */ __le64 ChangeTime; /* 72 8 */ __le32 FileAttributes; /* 80 4 */ }; /* 48 40 */ struct { __le64 CreationTime; /* 48 8 */ __le64 LastAccessTime; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ __le64 LastWriteTime; /* 64 8 */ __le64 ChangeTime; /* 72 8 */ __le32 FileAttributes; /* 80 4 */ } common_attributes; /* 48 40 */ }; /* 48 40 */ ... /* size: 111, cachelines: 2, members: 14 */ /* last cacheline: 47 bytes */ } __attribute__((__packed__)); pahole -C smb_com_open_rsp fs/smb/client/cifssmb.o struct smb_com_open_rsp { ... union { struct { __le64 CreationTime; /* 48 8 */ __le64 LastAccessTime; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ __le64 LastWriteTime; /* 64 8 */ __le64 ChangeTime; /* 72 8 */ __le32 FileAttributes; /* 80 4 */ } __attribute__((__packed__)); /* 48 36 */ struct { __le64 CreationTime; /* 48 8 */ __le64 LastAccessTime; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ __le64 LastWriteTime; /* 64 8 */ __le64 ChangeTime; /* 72 8 */ __le32 FileAttributes; /* 80 4 */ } __attribute__((__packed__)) common_attributes; /* 48 36 */ }; /* 48 36 */ ... /* size: 107, cachelines: 2, members: 14 */ /* last cacheline: 43 bytes */ } __attribute__((__packed__)); pahole -C FILE_ALL_INFO fs/smb/client/cifssmb.o typedef struct { union { struct { __le64 CreationTime; /* 0 8 */ __le64 LastAccessTime; /* 8 8 */ __le64 LastWriteTime; /* 16 8 */ __le64 ChangeTime; /* 24 8 */ __le32 Attributes; /* 32 4 */ }; /* 0 40 */ struct { __le64 CreationTime; /* 0 8 */ __le64 LastAccessTime; /* 8 8 */ __le64 LastWriteTime; /* 16 8 */ __le64 ChangeTime; /* 24 8 */ __le32 Attributes; /* 32 4 */ } common_attributes; /* 0 40 */ }; /* 0 40 */ ... /* size: 113, cachelines: 2, members: 17 */ /* last cacheline: 49 bytes */ } __attribute__((__packed__)) FILE_ALL_INFO; pahole -C FILE_ALL_INFO fs/smb/client/cifssmb.o typedef struct { union { struct { __le64 CreationTime; /* 0 8 */ __le64 LastAccessTime; /* 8 8 */ __le64 LastWriteTime; /* 16 8 */ __le64 ChangeTime; /* 24 8 */ __le32 Attributes; /* 32 4 */ } __attribute__((__packed__)); /* 0 36 */ struct { __le64 CreationTime; /* 0 8 */ __le64 LastAccessTime; /* 8 8 */ __le64 LastWriteTime; /* 16 8 */ __le64 ChangeTime; /* 24 8 */ __le32 Attributes; /* 32 4 */ } __attribute__((__packed__)) common_attributes; /* 0 36 */ }; /* 0 36 */ ... /* size: 109, cachelines: 2, members: 17 */ /* last cacheline: 45 bytes */ } __attribute__((__packed__)) FILE_ALL_INFO; Fixes: 0015eb6e1238 ("smb: client, common: fix fortify warnings") Cc: stable@vger.kernel.org Reviewed-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-23mei: me: add lunar lake point M DIDAlexander Usyskin
Add Lunar (Point) Lake M device id. Cc: stable@vger.kernel.org Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Link: https://lore.kernel.org/r/20240421135631.223362-1-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-23mei: pxp: match against PCI_CLASS_DISPLAY_OTHERDaniele Ceraolo Spurio
The ATS-M class is PCI_CLASS_DISPLAY_OTHER instead of PCI_CLASS_DISPLAY_VGA, so we need to match against that class as well. The matching is still restricted to Intel devices only. Fixes: ceeedd951f8a ("mei: pxp: match without driver name") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Link: https://lore.kernel.org/r/20240421090701.216028-1-tomas.winkler@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-24LoongArch: Fix callchain parse error with kernel tracepoint eventsHuacai Chen
In order to fix perf's callchain parse error for LoongArch, we implement perf_arch_fetch_caller_regs() which fills several necessary registers used for callchain unwinding, including sp, fp, and era. This is similar to the following commits. commit b3eac0265bf6: ("arm: perf: Fix callchain parse error with kernel tracepoint events") commit 5b09a094f2fb: ("arm64: perf: Fix callchain parse error with kernel tracepoint events") commit 9a7e8ec0d4cc: ("riscv: perf: Fix callchain parse error with kernel tracepoint events") Test with commands: perf record -e sched:sched_switch -g --call-graph dwarf perf report Without this patch: Children Self Command Shared Object Symbol ........ ........ ............. ................. .................... 43.41% 43.41% swapper [unknown] [k] 0000000000000000 10.94% 10.94% loong-container [unknown] [k] 0000000000000000 | |--5.98%--0x12006ba38 | |--2.56%--0x12006bb84 | --2.40%--0x12006b6b8 With this patch, callchain can be parsed correctly: Children Self Command Shared Object Symbol ........ ........ ............. ................. .................... 47.57% 47.57% swapper [kernel.vmlinux] [k] __schedule | ---__schedule 26.76% 26.76% loong-container [kernel.vmlinux] [k] __schedule | |--13.78%--0x12006ba38 | | | |--9.19%--__schedule | | | --4.59%--handle_syscall | do_syscall | sys_futex | do_futex | futex_wait | futex_wait_queue_me | hrtimer_start_range_ns | __schedule | |--8.38%--0x12006bb84 | handle_syscall | do_syscall | sys_epoll_pwait | do_epoll_wait | schedule_hrtimeout_range_clock | hrtimer_start_range_ns | __schedule | --4.59%--0x12006b6b8 handle_syscall do_syscall sys_nanosleep hrtimer_nanosleep do_nanosleep hrtimer_start_range_ns __schedule Cc: stable@vger.kernel.org Fixes: b37042b2bb7cd751f0 ("LoongArch: Add perf events support") Reported-by: Youling Tang <tangyouling@kylinos.cn> Suggested-by: Youling Tang <tangyouling@kylinos.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-04-24LoongArch: Fix access error when read fault on a write-only VMAJiantao Shan
As with most architectures, allow handling of read faults in VMAs that have VM_WRITE but without VM_READ (WRITE implies READ). Otherwise, reading before writing a write-only memory will error while reading after writing everything is fine. BTW, move the VM_EXEC judgement before VM_READ/VM_WRITE to make logic a little clearer. Cc: stable@vger.kernel.org Fixes: 09cfefb7fa70c3af01 ("LoongArch: Add memory management") Signed-off-by: Jiantao Shan <shanjiantao@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-04-24LoongArch: Fix a build error due to __tlb_remove_tlb_entry()David Hildenbrand
With LLVM=1 and W=1 we get: ./include/asm-generic/tlb.h:629:10: error: parameter 'ptep' set but not used [-Werror,-Wunused-but-set-parameter] We fixed a similar issue via Arnd in the introducing commit, missed the LoongArch variant. Turns out, there is no need for LoongArch to have a custom variant, so let's just drop it and rely on the asm-generic one. Fixes: 4d5bf0b6183f ("mm/mmu_gather: add tlb_remove_tlb_entries()") Closes: https://lkml.kernel.org/r/CANiq72mQh3O9S4umbvrKBgMMorty48UMwS01U22FR0mRyd3cyQ@mail.gmail.com Reported-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Tested-by: Miguel Ojeda <ojeda@kernel.org> Tested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-04-24LoongArch: Fix Kconfig item and left code related to CRASH_COREBaoquan He
In commit 85fcde402db191b5 ("kexec: split crashkernel reservation code out from crash_core.c"), crashkernel reservation code is split out from crash_core.c, and add CRASH_RESERVE to control it. And also rename each ARCH's <asm/crash_core.h> to <asm/crash_reserve.h> accordingly. But the relevant part in LoongArch is missed. Do it now. Fixes: 85fcde402db1 ("kexec: split crashkernel reservation code out from crash_core.c") Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-04-23Merge tag 'iio-fixes-for-6.9a' of ↵Greg Kroah-Hartman
https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-linus Jonathan writes: IIO: 1st set of fixes for the 6.9 cycle. adi,asdis16475 - Write the correct field in the register when setting the sync mode. bosch,bmp280 - Wrong chip specific data being used for the bme280 in the SPI driver. - Fix that we can't use chip IDs because Bosch reuses them for incompatible devices (some require a padding byte, others don't). maxim,max30102 (dt binding) - Fix incorrect property check to actually match on a device from the binding rather than a completely different one due to a typo. memsic,mxc4005 - Fix wrong masking of interrupt register accidentally disabling temperature compensation. Also hammer initial state to 0 as it's not documented if interrupts are masked after reset. - Explicit reset on probe() and resume() as some devices do not power up correctly without a reset. * tag 'iio-fixes-for-6.9a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: iio:imu: adis16475: Fix sync mode setting iio: accel: mxc4005: Reset chip on probe() and resume() iio: accel: mxc4005: Interrupt handling fixes dt-bindings: iio: health: maxim,max30102: fix compatible check iio: pressure: Fixes SPI support for BMP3xx devices iio: pressure: Fixes BME280 SPI driver data
2024-04-23workqueue: The default node_nr_active should have its max set to max_activeTejun Heo
The default nna (node_nr_active) is used when the pool isn't tied to a specific NUMA node. This can happen in the following cases: 1. On NUMA, if per-node pwq init failure and the fallback pwq is used. 2. On NUMA, if a pool is configured to span multiple nodes. 3. On single node setups. 5797b1c18919 ("workqueue: Implement system-wide nr_active enforcement for unbound workqueues") set the default nna->max to min_active because only #1 was being considered. For #2 and #3, using min_active means that the max concurrency in normal operation is pushed down to min_active which is currently 8, which can obviously lead to performance issues. exact value nna->max is set to doesn't really matter. #2 can only happen if the workqueue is intentionally configured to ignore NUMA boundaries and there's no good way to distribute max_active in this case. #3 is the default behavior on single node machines. Let's set it the default nna->max to max_active. This fixes the artificially lowered concurrency problem on single node machines and shouldn't hurt anything for other cases. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Fixes: 5797b1c18919 ("workqueue: Implement system-wide nr_active enforcement for unbound workqueues") Link: https://lore.kernel.org/dm-devel/20240410084531.2134621-1-shinichiro.kawasaki@wdc.com/ Signed-off-by: Tejun Heo <tj@kernel.org>
2024-04-23drm/amdgpu/mes: fix use-after-free issueJack Xiao
Delete fence fallback timer to fix the ramdom use-after-free issue. v2: move to amdgpu_mes.c Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-23drm/amdgpu/sdma5.2: use legacy HDP flush for SDMA2/3Alex Deucher
This avoids a potential conflict with firmwares with the newer HDP flush mechanism. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-04-23drm/amdgpu: Fix the ring buffer size for queue VM flushPrike Liang
Here are the corrections needed for the queue ring buffer size calculation for the following cases: - Remove the KIQ VM flush ring usage. - Add the invalidate TLBs packet for gfx10 and gfx11 queue. - There's no VM flush and PFP sync, so remove the gfx9 real ring and compute ring buffer usage. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-23drm/amdkfd: Add VRAM accounting for SVM migrationMukul Joshi
Do VRAM accounting when doing migrations to vram to make sure there is enough available VRAM and migrating to VRAM doesn't evict other possible non-unified memory BOs. If migrating to VRAM fails, driver can fall back to using system memory seamlessly. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-23drm/amd/pm: Restore config space after resetLijo Lazar
During mode-2 reset, pci config space registers are affected at device side. However, certain platforms have switches which assign virtual BAR addresses and returns the same even after device is reset. This affects pci_restore_state() as it doesn't issue another config write, if the value read is same as the saved value. Add a workaround to write saved config space values from driver side. Presently, these switches are in platforms with SMU v13.0.6 SOCs, hence restrict the workaround only to those. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-23drm/amdgpu/umsch: don't execute umsch test when GPU is in reset/suspendLang Yu
umsch test needs full GPU functionality(e.g., VM update, TLB flush, possibly buffer moving under memory pressure) which may be not ready under these states. Just skip it to avoid potential issues. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-04-23drm/amdkfd: Fix rescheduling of restore workerFelix Kuehling
Handle the case that the restore worker was already scheduled by another eviction while the restore was in progress. Fixes: 9a1c1339abf9 ("drm/amdkfd: Run restore_workers on freezable WQs") Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Yunxiang Li <Yunxiang.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-04-23drm/amdgpu: Update BO eviction prioritiesFelix Kuehling
Make SVM BOs more likely to get evicted than other BOs. These BOs opportunistically use available VRAM, but can fall back relatively seamlessly to system memory. It also avoids SVM migrations evicting other, more important BOs as they will evict other SVM allocations first. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Mukul Joshi <mukul.joshi@amd.com> Tested-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-23drm/amdgpu/vpe: fix vpe dpm setup failedPeyton Lee
The vpe dpm settings should be done before firmware is loaded. Otherwise, the frequency cannot be successfully raised. Signed-off-by: Peyton Lee <peytolee@amd.com> Reviewed-by: Lang Yu <lang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-23drm/amdgpu: Assign correct bits for SDMA HDP flushLijo Lazar
HDP Flush request bit can be kept unique per AID, and doesn't need to be unique SOC-wide. Assign only bits 10-13 for SDMA v4.4.2. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-04-23drm/amdgpu/pm: Remove gpu_od if it's an empty directoryMa Jun
gpu_od should be removed if it's an empty directory Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reported-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-04-23drm/amdkfd: make sure VM is ready for updating operationsLang Yu
When page table BOs were evicted but not validated before updating page tables, VM is still in evicting state, amdgpu_vm_update_range returns -EBUSY and restore_process_worker runs into a dead loop. v2: Split the BO validation and page table update into two separate loops in amdgpu_amdkfd_restore_process_bos. (Felix) 1.Validate BOs 2.Validate VM (and DMABuf attachments) 3.Update page tables for the BOs validated above Fixes: 50661eb1a2c8 ("drm/amdgpu: Auto-validate DMABuf imports in compute VMs") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-04-23drm/amdgpu: Fix leak when GPU memory allocation failsMukul Joshi
Free the sync object if the memory allocation fails for any reason. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-04-23drm/amdkfd: Fix eviction fence handlingFelix Kuehling
Handle case that dma_fence_get_rcu_safe returns NULL. If restore work is already scheduled, only update its timer. The same work item cannot be queued twice, so undo the extra queue eviction. Fixes: 9a1c1339abf9 ("drm/amdkfd: Run restore_workers on freezable WQs") Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Gang BA <Gang.Ba@amd.com> Reviewed-by: Gang BA <Gang.Ba@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-04-23drm/amd/display: Set color_mgmt_changed to true on unsuspendJoshua Ashton
Otherwise we can end up with a frame on unsuspend where color management is not applied when userspace has not committed themselves. Fixes re-applying color management on Steam Deck/Gamescope on S3 resume. Signed-off-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-23tcp: Fix Use-After-Free in tcp_ao_connect_initHyunwoo Kim
Since call_rcu, which is called in the hlist_for_each_entry_rcu traversal of tcp_ao_connect_init, is not part of the RCU read critical section, it is possible that the RCU grace period will pass during the traversal and the key will be free. To prevent this, it should be changed to hlist_for_each_entry_safe. Fixes: 7c2ffaf21bd6 ("net/tcp: Calculate TCP-AO traffic keys") Signed-off-by: Hyunwoo Kim <v4bel@theori.io> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://lore.kernel.org/r/ZiYu9NJ/ClR8uSkH@v4bel-B760M-AORUS-ELITE-AX Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-23net: usb: ax88179_178a: stop lying about skb->truesizeEric Dumazet
Some usb drivers try to set small skb->truesize and break core networking stacks. In this patch, I removed one of the skb->truesize overide. I also replaced one skb_clone() by an allocation of a fresh and small skb, to get minimally sized skbs, like we did in commit 1e2c61172342 ("net: cdc_ncm: reduce skb truesize in rx path") Fixes: f8ebb3ac881b ("net: usb: ax88179_178a: Fix packet receiving") Reported-by: shironeko <shironeko@tesaguri.club> Closes: https://lore.kernel.org/netdev/c110f41a0d2776b525930f213ca9715c@tesaguri.club/ Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jose Alonso <joalonsof@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240421193828.1966195-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-23ipv4: check for NULL idev in ip_route_use_hint()Eric Dumazet
syzbot was able to trigger a NULL deref in fib_validate_source() in an old tree [1]. It appears the bug exists in latest trees. All calls to __in_dev_get_rcu() must be checked for a NULL result. [1] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 2 PID: 3257 Comm: syz-executor.3 Not tainted 5.10.0-syzkaller #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:fib_validate_source+0xbf/0x15a0 net/ipv4/fib_frontend.c:425 Code: 18 f2 f2 f2 f2 42 c7 44 20 23 f3 f3 f3 f3 48 89 44 24 78 42 c6 44 20 27 f3 e8 5d 88 48 fc 4c 89 e8 48 c1 e8 03 48 89 44 24 18 <42> 80 3c 20 00 74 08 4c 89 ef e8 d2 15 98 fc 48 89 5c 24 10 41 bf RSP: 0018:ffffc900015fee40 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88800f7a4000 RCX: ffff88800f4f90c0 RDX: 0000000000000000 RSI: 0000000004001eac RDI: ffff8880160c64c0 RBP: ffffc900015ff060 R08: 0000000000000000 R09: ffff88800f7a4000 R10: 0000000000000002 R11: ffff88800f4f90c0 R12: dffffc0000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88800f7a4000 FS: 00007f938acfe6c0(0000) GS:ffff888058c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f938acddd58 CR3: 000000001248e000 CR4: 0000000000352ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ip_route_use_hint+0x410/0x9b0 net/ipv4/route.c:2231 ip_rcv_finish_core+0x2c4/0x1a30 net/ipv4/ip_input.c:327 ip_list_rcv_finish net/ipv4/ip_input.c:612 [inline] ip_sublist_rcv+0x3ed/0xe50 net/ipv4/ip_input.c:638 ip_list_rcv+0x422/0x470 net/ipv4/ip_input.c:673 __netif_receive_skb_list_ptype net/core/dev.c:5572 [inline] __netif_receive_skb_list_core+0x6b1/0x890 net/core/dev.c:5620 __netif_receive_skb_list net/core/dev.c:5672 [inline] netif_receive_skb_list_internal+0x9f9/0xdc0 net/core/dev.c:5764 netif_receive_skb_list+0x55/0x3e0 net/core/dev.c:5816 xdp_recv_frames net/bpf/test_run.c:257 [inline] xdp_test_run_batch net/bpf/test_run.c:335 [inline] bpf_test_run_xdp_live+0x1818/0x1d00 net/bpf/test_run.c:363 bpf_prog_test_run_xdp+0x81f/0x1170 net/bpf/test_run.c:1376 bpf_prog_test_run+0x349/0x3c0 kernel/bpf/syscall.c:3736 __sys_bpf+0x45c/0x710 kernel/bpf/syscall.c:5115 __do_sys_bpf kernel/bpf/syscall.c:5201 [inline] __se_sys_bpf kernel/bpf/syscall.c:5199 [inline] __x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5199 Fixes: 02b24941619f ("ipv4: use dst hint for ipv4 list receive") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240421184326.1704930-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-23net: fix sk_memory_allocated_{add|sub} vs softirqsEric Dumazet
Jonathan Heathcote reported a regression caused by blamed commit on aarch64 architecture. x86 happens to have irq-safe __this_cpu_add_return() and __this_cpu_sub(), but this is not generic. I think my confusion came from "struct sock" argument, because these helpers are called with a locked socket. But the memory accounting is per-proto (and per-cpu after the blamed commit). We might cleanup these helpers later to directly accept a "struct proto *proto" argument. Switch to this_cpu_add_return() and this_cpu_xchg() operations, and get rid of preempt_disable()/preempt_enable() pairs. Fast path becomes a bit faster as a result :) Many thanks to Jonathan Heathcote for his awesome report and investigations. Fixes: 3cd3399dd7a8 ("net: implement per-cpu reserves for memory_allocated") Reported-by: Jonathan Heathcote <jonathan.heathcote@bbc.co.uk> Closes: https://lore.kernel.org/netdev/VI1PR01MB42407D7947B2EA448F1E04EFD10D2@VI1PR01MB4240.eurprd01.prod.exchangelabs.com/ Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Link: https://lore.kernel.org/r/20240421175248.1692552-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-24regulator: change devm_regulator_get_enable_optional() stub to return OkMatti Vaittinen
The devm_regulator_get_enable_optional() should be a 'call and forget' API, meaning, when it is used to enable the regulators, the API does not provide a handle to do any further control of the regulators. It gives no real benefit to return an error from the stub if CONFIG_REGULATOR is not set. On the contrary, returning an error is causing problems to drivers when hardware is such it works out just fine with no regulator control. Returning an error forces drivers to specifically handle the case where CONFIG_REGULATOR is not set, making the mere existence of the stub questionalble. Change the stub implementation for the devm_regulator_get_enable_optional() to return Ok so drivers do not separately handle the case where the CONFIG_REGULATOR is not set. Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com> Fixes: da279e6965b3 ("regulator: Add devm helpers for get and enable") Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/ZiedtOE00Zozd3XO@fedora Signed-off-by: Mark Brown <broonie@kernel.org>
2024-04-23Revert "NFSD: Convert the callback workqueue to use delayed_work"Chuck Lever
This commit was a pre-requisite for commit c1ccfcf1a9bf ("NFSD: Reschedule CB operations when backchannel rpc_clnt is shut down"), which has already been reverted. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-04-23Revert "NFSD: Reschedule CB operations when backchannel rpc_clnt is shut down"Chuck Lever
The reverted commit attempted to enable NFSD to retransmit pending callback operations if an NFS client disconnects, but unintentionally introduces a hazardous behavior regression if the client becomes permanently unreachable while callback operations are still pending. A disconnect can occur due to network partition or if the NFS server needs to force the NFS client to retransmit (for example, if a GSS window under-run occurs). Reverting the commit will make NFSD behave the same as it did in v6.8 and before. Pending callback operations are permanently lost if the client connection is terminated before the client receives them. For some callback operations, this loss is not harmful. However, for CB_RECALL, the loss means a delegation might be revoked unnecessarily. For CB_OFFLOAD, pending COPY operations will never complete unless the NFS client subsequently sends an OFFLOAD_STATUS operation, which the Linux NFS client does not currently implement. These issues still need to be addressed somehow. Reported-by: Dai Ngo <dai.ngo@oracle.com> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218735 Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-04-23usb: typec: qcom-pmic: fix pdphy start() error handlingJohan Hovold
Move disabling of the vdd-pdphy supply to the start() function which enabled it for symmetry and to make sure that it is disabled as intended in all error paths of pmic_typec_pdphy_reset() (i.e. not just when qcom_pmic_typec_pdphy_enable() fails). Cc: stable+noautosel@kernel.org # Not needed in any stable release, just a minor bugfix Fixes: a4422ff22142 ("usb: typec: qcom: Add Qualcomm PMIC Type-C driver") Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20240418145730.4605-3-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-23usb: typec: qcom-pmic: fix use-after-free on late probe errorsJohan Hovold
Make sure to stop and deregister the port in case of late probe errors to avoid use-after-free issues when the underlying memory is released by devres. Fixes: a4422ff22142 ("usb: typec: qcom: Add Qualcomm PMIC Type-C driver") Cc: stable@vger.kernel.org # 6.5 Cc: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20240418145730.4605-2-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-23usb: gadget: f_fs: Fix a race condition when processing setup packets.Chris Wulff
If the USB driver passes a pointer into the TRB buffer for creq, this buffer can be overwritten with the status response as soon as the event is queued. This can make the final check return USB_GADGET_DELAYED_STATUS when it shouldn't. Instead use the stored wLength. Fixes: 4d644abf2569 ("usb: gadget: f_fs: Only return delayed status when len is 0") Cc: stable <stable@kernel.org> Signed-off-by: Chris Wulff <chris.wulff@biamp.com> Link: https://lore.kernel.org/r/CO1PR17MB5419BD664264A558B2395E28E1112@CO1PR17MB5419.namprd17.prod.outlook.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-23USB: core: Fix access violation during port device removalAlan Stern
Testing with KASAN and syzkaller revealed a bug in port.c:disable_store(): usb_hub_to_struct_hub() can return NULL if the hub that the port belongs to is concurrently removed, but the function does not check for this possibility before dereferencing the returned value. It turns out that the first dereference is unnecessary, since hub->intfdev is the parent of the port device, so it can be changed easily. Adding a check for hub == NULL prevents further problems. The same bug exists in the disable_show() routine, and it can be fixed the same way. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: Yue Sun <samsun1006219@gmail.com> Reported-by: xingwei lee <xrivendell7@gmail.com> Link: https://lore.kernel.org/linux-usb/CAEkJfYON+ry7xPx=AiLR9jzUNT+i_Va68ACajOC3HoacOfL1ig@mail.gmail.com/ Fixes: f061f43d7418 ("usb: hub: port: add sysfs entry to switch port power") CC: Michael Grzeschik <m.grzeschik@pengutronix.de> CC: stable@vger.kernel.org Link: https://lore.kernel.org/r/393aa580-15a5-44ca-ad3b-6462461cd313@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-23usb: dwc3: core: Prevent phy suspend during initThinh Nguyen
GUSB3PIPECTL.SUSPENDENABLE and GUSB2PHYCFG.SUSPHY should be cleared during initialization. Suspend during initialization can result in undefined behavior due to clock synchronization failure, which often seen as core soft reset timeout. The programming guide recommended these bits to be cleared during initialization for DWC_usb3.0 version 1.94 and above (along with DWC_usb31 and DWC_usb32). The current check in the driver does not account if it's set by default setting from coreConsultant. This is especially the case for DRD when switching mode to ensure the phy clocks are available to change mode. Depending on the platforms/design, some may be affected more than others. This is noted in the DWC_usb3x programming guide under the above registers. Let's just disable them during driver load and mode switching. Restore them when the controller initialization completes. Note that some platforms workaround this issue by disabling phy suspend through "snps,dis_u3_susphy_quirk" and "snps,dis_u2_susphy_quirk" when they should not need to. Cc: stable@vger.kernel.org Fixes: 9ba3aca8fe82 ("usb: dwc3: Disable phy suspend after power-on reset") Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/20da4e5a0c4678c9587d3da23f83bdd6d77353e9.1713394973.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-23usb: xhci-plat: Don't include xhci.hThinh Nguyen
The xhci_plat.h should not need to include the entire xhci.h header. This can cause redefinition in dwc3 if it selectively includes some xHCI definitions. This is a prerequisite change for a fix to disable suspend during initialization for dwc3. Cc: stable@vger.kernel.org Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/310acfa01c957a10d9feaca3f7206269866ba2eb.1713394973.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-23usb: gadget: uvc: use correct buffer size when parsing configfs listsIvan Avdeev
This commit fixes uvc gadget support on 32-bit platforms. Commit 0df28607c5cb ("usb: gadget: uvc: Generalise helper functions for reuse") introduced a helper function __uvcg_iter_item_entries() to aid with parsing lists of items on configfs attributes stores. This function is a generalization of another very similar function, which used a stack-allocated temporary buffer of fixed size for each item in the list and used the sizeof() operator to check for potential buffer overruns. The new function was changed to allocate the now variably sized temp buffer on heap, but wasn't properly updated to also check for max buffer size using the computed size instead of sizeof() operator. As a result, the maximum item size was 7 (plus null terminator) on 64-bit platforms, and 3 on 32-bit ones. While 7 is accidentally just barely enough, 3 is definitely too small for some of UVC configfs attributes. For example, dwFrameInteval, specified in 100ns units, usually has 6-digit item values, e.g. 166666 for 60fps. Cc: stable@vger.kernel.org Fixes: 0df28607c5cb ("usb: gadget: uvc: Generalise helper functions for reuse") Signed-off-by: Ivan Avdeev <me@provod.works> Link: https://lore.kernel.org/r/20240413150124.1062026-1-me@provod.works Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-23usb: gadget: composite: fix OS descriptors w_value logicPeter Korsgaard
The OS descriptors logic had the high/low byte of w_value inverted, causing the extended properties to not be accessible for interface != 0. >From the Microsoft documentation: https://learn.microsoft.com/en-us/windows-hardware/drivers/usbcon/microsoft-os-1-0-descriptors-specification OS_Desc_CompatID.doc (w_index = 0x4): - wValue: High Byte = InterfaceNumber. InterfaceNumber is set to the number of the interface or function that is associated with the descriptor, typically 0x00. Because a device can have only one extended compat ID descriptor, it should ignore InterfaceNumber, regardless of the value, and simply return the descriptor. Low Byte = 0. PageNumber is used to retrieve descriptors that are larger than 64 KB. The header section is 16 bytes, so PageNumber is set to 0 for this request. We currently do not support >64KB compat ID descriptors, so verify that the low byte is 0. OS_Desc_Ext_Prop.doc (w_index = 0x5): - wValue: High byte = InterfaceNumber. The high byte of wValue is set to the number of the interface or function that is associated with the descriptor. Low byte = PageNumber. The low byte of wValue is used to retrieve descriptors that are larger than 64 KB. The header section is 10 bytes, so PageNumber is set to 0 for this request. We also don't support >64KB extended properties, so verify that the low byte is 0 and use the high byte for the interface number. Fixes: 37a3a533429e ("usb: gadget: OS Feature Descriptors support") Cc: stable <stable@kernel.org> Signed-off-by: Peter Korsgaard <peter@korsgaard.com> Link: https://lore.kernel.org/r/20240404100635.3215340-1-peter@korsgaard.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-23usb: gadget: f_fs: Fix race between aio_cancel() and AIO request completeWesley Cheng
FFS based applications can utilize the aio_cancel() callback to dequeue pending USB requests submitted to the UDC. There is a scenario where the FFS application issues an AIO cancel call, while the UDC is handling a soft disconnect. For a DWC3 based implementation, the callstack looks like the following: DWC3 Gadget FFS Application dwc3_gadget_soft_disconnect() ... --> dwc3_stop_active_transfers() --> dwc3_gadget_giveback(-ESHUTDOWN) --> ffs_epfile_async_io_complete() ffs_aio_cancel() --> usb_ep_free_request() --> usb_ep_dequeue() There is currently no locking implemented between the AIO completion handler and AIO cancel, so the issue occurs if the completion routine is running in parallel to an AIO cancel call coming from the FFS application. As the completion call frees the USB request (io_data->req) the FFS application is also referencing it for the usb_ep_dequeue() call. This can lead to accessing a stale/hanging pointer. commit b566d38857fc ("usb: gadget: f_fs: use io_data->status consistently") relocated the usb_ep_free_request() into ffs_epfile_async_io_complete(). However, in order to properly implement locking to mitigate this issue, the spinlock can't be added to ffs_epfile_async_io_complete(), as usb_ep_dequeue() (if successfully dequeuing a USB request) will call the function driver's completion handler in the same context. Hence, leading into a deadlock. Fix this issue by moving the usb_ep_free_request() back to ffs_user_copy_worker(), and ensuring that it explicitly sets io_data->req to NULL after freeing it within the ffs->eps_lock. This resolves the race condition above, as the ffs_aio_cancel() routine will not continue attempting to dequeue a request that has already been freed, or the ffs_user_copy_work() not freeing the USB request until the AIO cancel is done referencing it. This fix depends on commit b566d38857fc ("usb: gadget: f_fs: use io_data->status consistently") Fixes: 2e4c7553cd6f ("usb: gadget: f_fs: add aio support") Cc: stable <stable@kernel.org> # b566d38857fc ("usb: gadget: f_fs: use io_data->status consistently") Signed-off-by: Wesley Cheng <quic_wcheng@quicinc.com> Link: https://lore.kernel.org/r/20240409014059.6740-1-quic_wcheng@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>