git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-07-13	pinctrl: amd: Unify debounce handling into amd_pinconf_set()	Mario Limonciello
	Debounce handling is done in two different entry points in the driver. Unify this to make sure that it's always handled the same. Tested-by: Jan Visser <starquake@linuxeverywhere.org> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20230705133005.577-5-mario.limonciello@amd.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2023-07-13	pinctrl: amd: Drop pull up select configuration	Mario Limonciello
	pinctrl-amd currently tries to program bit 19 of all GPIOs to select either a 4kΩ or 8hΩ pull up, but this isn't what bit 19 does. Bit 19 is marked as reserved, even in the latest platforms documentation. Drop this programming functionality. Tested-by: Jan Visser <starquake@linuxeverywhere.org> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20230705133005.577-4-mario.limonciello@amd.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2023-07-13	pinctrl: amd: Use amd_pinconf_set() for all config options	Mario Limonciello
	On ASUS TUF A16 it is reported that the ITE5570 ACPI device connected to GPIO 7 is causing an interrupt storm. This issue doesn't happen on Windows. Comparing the GPIO register configuration between Windows and Linux bit 20 has been configured as a pull up on Windows, but not on Linux. Checking GPIO declaration from the firmware it is clear it should have been a pull up on Linux as well. ``` GpioInt (Level, ActiveLow, Exclusive, PullUp, 0x0000, "\\_SB.GPIO", 0x00, ResourceConsumer, ,) { // Pin list 0x0007 } ``` On Linux amd_gpio_set_config() is currently only used for programming the debounce. Actually the GPIO core calls it with all the arguments that are supported by a GPIO, pinctrl-amd just responds `-ENOTSUPP`. To solve this issue expand amd_gpio_set_config() to support the other arguments amd_pinconf_set() supports, namely `PIN_CONFIG_BIAS_PULL_DOWN`, `PIN_CONFIG_BIAS_PULL_UP`, and `PIN_CONFIG_DRIVE_STRENGTH`. Reported-by: Nik P <npliashechnikov@gmail.com> Reported-by: Nathan Schulte <nmschulte@gmail.com> Reported-by: Friedrich Vock <friedrich.vock@gmx.de> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217336 Reported-by: dridri85@gmail.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217493 Link: https://lore.kernel.org/linux-input/20230530154058.17594-1-friedrich.vock@gmx.de/ Tested-by: Jan Visser <starquake@linuxeverywhere.org> Fixes: 2956b5d94a76 ("pinctrl / gpio: Introduce .set_config() callback for GPIO chips") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230705133005.577-3-mario.limonciello@amd.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2023-07-13	pinctrl: amd: Only use special debounce behavior for GPIO 0	Mario Limonciello
	It's uncommon to use debounce on any other pin, but technically we should only set debounce to 0 when working off GPIO0. Cc: stable@vger.kernel.org Tested-by: Jan Visser <starquake@linuxeverywhere.org> Fixes: 968ab9261627 ("pinctrl: amd: Detect internal GPIO0 debounce handling") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20230705133005.577-2-mario.limonciello@amd.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2023-07-12	tracing: Stop FORTIFY_SOURCE complaining about stack trace caller	Steven Rostedt (Google)
	The stack_trace event is an event created by the tracing subsystem to store stack traces. It originally just contained a hard coded array of 8 words to hold the stack, and a "size" to know how many entries are there. This is exported to user space as: name: kernel_stack ID: 4 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:int size; offset:8; size:4; signed:1; field:unsigned long caller[8]; offset:16; size:64; signed:0; print fmt: "\t=> %ps\n\t=> %ps\n\t=> %ps\n" "\t=> %ps\n\t=> %ps\n\t=> %ps\n" "\t=> %ps\n\t=> %ps\n",i (void )REC->caller[0], (void )REC->caller[1], (void )REC->caller[2], (void )REC->caller[3], (void )REC->caller[4], (void )REC->caller[5], (void )REC->caller[6], (void )REC->caller[7] Where the user space tracers could parse the stack. The library was updated for this specific event to only look at the size, and not the array. But some older users still look at the array (note, the older code still checks to make sure the array fits inside the event that it read. That is, if only 4 words were saved, the parser would not read the fifth word because it will see that it was outside of the event size). This event was changed a while ago to be more dynamic, and would save a full stack even if it was greater than 8 words. It does this by simply allocating more ring buffer to hold the extra words. Then it copies in the stack via: memcpy(&entry->caller, fstack->calls, size); As the entry is struct stack_entry, that is created by a macro to both create the structure and export this to user space, it still had the caller field of entry defined as: unsigned long caller[8]. When the stack is greater than 8, the FORTIFY_SOURCE code notices that the amount being copied is greater than the source array and complains about it. It has no idea that the source is pointing to the ring buffer with the required allocation. To hide this from the FORTIFY_SOURCE logic, pointer arithmetic is used: ptr = ring_buffer_event_data(event); entry = ptr; ptr += offsetof(typeof(*entry), caller); memcpy(ptr, fstack->calls, size); Link: https://lore.kernel.org/all/20230612160748.4082850-1-svens@linux.ibm.com/ Link: https://lore.kernel.org/linux-trace-kernel/20230712105235.5fc441aa@gandalf.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Reported-by: Sven Schnelle <svens@linux.ibm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-12	ftrace: Fix possible warning on checking all pages used in ftrace_process_locs()	Zheng Yejian
	As comments in ftrace_process_locs(), there may be NULL pointers in mcount_loc section: > Some architecture linkers will pad between > the different mcount_loc sections of different > object files to satisfy alignments. > Skip any NULL pointers. After commit 20e5227e9f55 ("ftrace: allow NULL pointers in mcount_loc"), NULL pointers will be accounted when allocating ftrace pages but skipped before adding into ftrace pages, this may result in some pages not being used. Then after commit 706c81f87f84 ("ftrace: Remove extra helper functions"), warning may occur at: WARN_ON(pg->next); To fix it, only warn for case that no pointers skipped but pages not used up, then free those unused pages after releasing ftrace_lock. Link: https://lore.kernel.org/linux-trace-kernel/20230712060452.3175675-1-zhengyejian1@huawei.com Cc: stable@vger.kernel.org Fixes: 706c81f87f84 ("ftrace: Remove extra helper functions") Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-12	drm/nouveau: bring back blit subchannel for pre nv50 GPUs	Karol Herbst
	1ba6113a90a0 removed a lot of the kernel GPU channel, but method 0x128 was important as otherwise the GPU spams us with `CACHE_ERROR` messages. We use the blit subchannel inside our vblank handling, so we should keep at least this part. v2: Only do it for NV11+ GPUs Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/201 Fixes: 4a16dd9d18a0 ("drm/nouveau/kms: switch to drm fbdev helpers") Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230526091052.2169044-1-kherbst@redhat.com
2023-07-12	drm/nouveau/acr: Abort loading ACR if no firmware was found	Karol Herbst
	This fixes a NULL pointer access inside nvkm_acr_oneinit in case necessary firmware files couldn't be loaded. Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/212 Fixes: 4b569ded09fd ("drm/nouveau/acr/ga102: initial support") Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230522201838.1496622-1-kherbst@redhat.com
2023-07-12	netdevsim: fix uninitialized data in nsim_dev_trap_fa_cookie_write()	Dan Carpenter
	The simple_write_to_buffer() function is designed to handle partial writes. It returns negatives on error, otherwise it returns the number of bytes that were able to be copied. This code doesn't check the return properly. We only know that the first byte is written, the rest of the buffer might be uninitialized. There is no need to use the simple_write_to_buffer() function. Partial writes are prohibited by the "if (*ppos != 0)" check at the start of the function. Just use memdup_user() and copy the whole buffer. Fixes: d3cbb907ae57 ("netdevsim: add ACL trap reporting cookie as a metadata") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/7c1f950b-3a7d-4252-82a6-876e53078ef7@moroto.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-12	Merge tag 'platform-drivers-x86-v6.5-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: "Misc small fixes and hw-id additions" * tag 'platform-drivers-x86-v6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: touchscreen_dmi: Add info for the Archos 101 Cesium Educ tablet platform/x86: dell-ddv: Fix mangled list in documentation platform/x86: dell-ddv: Improve error handling platform/x86/amd: pmf: Add new ACPI ID AMDI0103 platform/x86/amd: pmc: Add new ACPI ID AMDI000A platform/x86/amd: pmc: Apply nvme quirk to HP 15s-eq2xxx platform/x86: Move s2idle quirk from thinkpad-acpi to amd-pmc platform/x86: int3472/discrete: set variable skl_int3472_regulator_second_sensor storage-class-specifier to static platform/x86/intel/tpmi: Prevent overflow for cap_offset platform/x86: wmi: Replace open coded guid_parse_and_compare() platform/x86: wmi: Break possible infinite loop when parsing GUID
2023-07-12	Merge tag 'probes-fixes-v6.5-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fixes from Masami Hiramatsu: - Fix fprobe's rethook release issues: - Release rethook after ftrace_ops is unregistered so that the rethook is not accessed after free. - Stop rethook before ftrace_ops is unregistered so that the rethook is NOT used after exiting unregister_fprobe() - Fix eprobe cleanup logic. If it attaches to multiple events and failes to enable one of them, rollback all enabled events correctly. - Fix fprobe to unlock ftrace recursion lock correctly when it missed by another running kprobe. - Cleanup kprobe to remove unnecessary NULL. - Cleanup kprobe to remove unnecessary 0 initializations. * tag 'probes-fixes-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: fprobe: Ensure running fprobe_exit_handler() finished before calling rethook_free() kernel: kprobes: Remove unnecessary ‘0’ values kprobes: Remove unnecessary ‘NULL’ values from correct_ret_addr fprobe: add unlock to match a succeeded ftrace_test_recursion_trylock kernel/trace: Fix cleanup logic of enable_trace_eprobe fprobe: Release rethook after the ftrace_ops is unregistered
2023-07-12	Merge tag 'for-linus-2023071101' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Benjamin Tissoires: - AMD SFH shift-out-of-bounds fix (Basavaraj Natikar) - avoid struct memcpy overrun warning in the hid-hyperv module (Arnd Bergmann) - a quick HID kselftests script fix for our CI to be happy (Benjamin Tissoires) - various fixes and additions of device IDs * tag 'for-linus-2023071101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: amd_sfh: Fix for shift-out-of-bounds HID: amd_sfh: Rename the float32 variable HID: input: fix mapping for camera access keys HID: logitech-hidpp: Add wired USB id for Logitech G502 Lightspeed HID: nvidia-shield: Pack inner/related declarations in HOSTCMD reports HID: hyperv: avoid struct memcpy overrun warning selftests: hid: fix vmtests.sh not running make headers
2023-07-12	block/mq-deadline: Fix a bug in deadline_from_pos()	Bart Van Assche
	A bug was introduced in deadline_from_pos() while implementing the suggestion to use round_down() in the following code: pos -= bdev_offset_from_zone_start(rq->q->disk->part0, pos); This patch makes deadline_from_pos() use round_down() such that 'pos' is rounded down. Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Closes: https://lore.kernel.org/all/5zthzi3lppvcdp4nemum6qck4gpqbdhvgy4k3qwguhgzxc4quj@amulvgycq67h/ Cc: Christoph Hellwig <hch@lst.de> Cc: Damien Le Moal <dlemoal@kernel.org> Fixes: 0effb390c4ba ("block: mq-deadline: Handle requeued requests correctly") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20230712173344.2994513-1-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-07-13	drm/loongson: Remove a useless check in cursor_plane_atomic_async_check()	Sui Jingfeng
	Because smatch warnings: drivers/gpu/drm/loongson/lsdc_plane.c:199 lsdc_cursor_plane_atomic_async_check() warn: variable dereferenced before check 'state' (see line 180) vim +/state +199 drivers/gpu/drm/loongson/lsdc_plane.c 174 static int lsdc_cursor_plane_atomic_async_check(struct drm_plane plane, 175 struct drm_atomic_state state) 176 { 177 struct drm_plane_state new_state; 178 struct drm_crtc_state crtc_state; 179 180 new_state = drm_atomic_get_new_plane_state(state, plane); ^^^^^ state is dereferenced inside this function 181 182 if (!plane->state \|\| !plane->state->fb) { 183 drm_dbg(plane->dev, "%s: state is NULL\n", plane->name); 184 return -EINVAL; 185 } 186 187 if (new_state->crtc_w != new_state->crtc_h) { 188 drm_dbg(plane->dev, "unsupported cursor size: %ux%u\n", 189 new_state->crtc_w, new_state->crtc_h); 190 return -EINVAL; 191 } 192 193 if (new_state->crtc_w != 64 && new_state->crtc_w != 32) { 194 drm_dbg(plane->dev, "unsupported cursor size: %ux%u\n", 195 new_state->crtc_w, new_state->crtc_h); 196 return -EINVAL; 197 } 198 199 if (state) { ^^^^^ Checked too late! Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202307100423.rV7D05Uq-lkp@intel.com/ Signed-off-by: Sui Jingfeng <suijingfeng@loongson.cn> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Sui Jingfeng <sui.jingfeng@linux.dev> Link: https://patchwork.freedesktop.org/patch/msgid/20230710102411.257970-1-suijingfeng@loongson.cn
2023-07-12	RISC-V: Don't include Zicsr or Zifencei in I from ACPI	Palmer Dabbelt
	ACPI ISA strings are based on a specification after Zicsr and Zifencei were split out of I, so we shouldn't be treating them as part of I. We haven't release an ACPI-based kernel yet, so we don't need to worry about compatibility with the old ISA strings. Fixes: 07edc32779e3 ("RISC-V: always report presence of extensions formerly part of the base ISA") Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Sunil V L <sunilvl@ventanamicro.com> Link: https://lore.kernel.org/r/20230711224600.10879-1-palmer@rivosinc.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-07-12	nvme: ensure disabling pairs with unquiesce	Keith Busch
	If any error handling that disables the controller fails to queue the reset work, like if the state changed to disconnected inbetween, then the failed teardown needs to unquiesce the queues since it's no longer paired with reset_work. Just make sure that the controller can be put into a resetting state prior to starting the disable so that no other handling can change the queue states while recovery is happening. Reported-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-07-12	nvme-fc: fix race between error recovery and creating association	Michael Liang
	There is a small race window between nvme-fc association creation and error recovery. Fix this race condition by protecting accessing to controller state and ASSOC_FAILED flag under nvme-fc controller lock. Signed-off-by: Michael Liang <mliang@purestorage.com> Reviewed-by: Caleb Sander <csander@purestorage.com> Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-07-12	nvme-fc: return non-zero status code when fails to create association	Michael Liang
	Return non-zero status code(-EIO) when needed, so re-connecting or deleting controller will be triggered properly. Signed-off-by: Michael Liang <mliang@purestorage.com> Reviewed-by: Caleb Sander <csander@purestorage.com> Reviewed-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-07-12	drm/i915/mtl: Update cache coherency setting for context structure	Zhanjun Dong
	As context structure is shared memory for CPU/GPU, Wa_22016122933 is needed for this memory block as well. Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> CC: Fei Yang <fei.yang@intel.com> Reviewed-by: Fei Yang <fei.yang@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230706174704.177929-1-zhanjun.dong@intel.com
2023-07-12	drm/amd/display: dc.h: eliminate kernel-doc warnings	Randy Dunlap
	Quash 175 kernel-doc warnings in dc.h by unmarking 2 struct comments as containing kernel-doc notation and by spelling one struct field correctly in a kernel-doc comment. Fixes: 1682bd1a6b5f ("drm/amd/display: Expand kernel doc for DC") Fixes: ea76895ffab1 ("drm/amd/display: Document pipe split policy") Fixes: f6ae69f49fcf ("drm/amd/display: Include surface of unaffected streams") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: dri-devel@lists.freedesktop.org
2023-07-12	drm/amdgpu: Rename to amdgpu_vm_tlb_seq_struct	Luben Tuikov
	Rename struct amdgpu_vm_tlb_seq_cb {...} to struct amdgpu_vm_tlb_seq_struct {...}, so as to not conflict with documentation processing tools. Of course, C has no problem with this. Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/b5ebc891-ee63-1638-0377-7b512d34b823@infradead.org Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/amdkfd: Fix stack size in 'amdgpu_amdkfd_unmap_hiq'	Srinivasan Shanmugam
	Dynamically allocate large local variable instead of putting it onto the stack to avoid exceeding the stack size: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c: In function ‘amdgpu_amdkfd_unmap_hiq’: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c:868:1: warning: the frame size of 1280 bytes is larger than 1024 bytes [-Wframe-larger-than=] Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202307080505.V12qS0oz-lkp@intel.com Suggested-by: Guchun Chen <guchun.chen@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/amdkfd: report dispatch id always saved in ttmps after gc9.4.2	Jonathan Kim
	The feature to save the dispatch ID in trap temporaries 6 & 7 on context save is unconditionally enabled during MQD initialization. Now that TTMPs are always setup regardless of debug mode for GC 9.4.3, we should report that the dispatch ID is always available for debug/trap handling. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13	Mario Limonciello
	SMU13 overrides dynamic PCIe lane width and dynamic speed by when on certain hosts. commit 38e4ced80479 ("drm/amd/pm: conditionally disable pcie lane switching for some sienna_cichlid SKUs") worked around this issue by setting up certain SKUs to set up certain limits, but the same fundamental problem with those hosts affects all SMU11 implmentations as well, so align the SMU11 and SMU13 driver handling. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13	Mario Limonciello
	SMU13 overrides dynamic PCIe lane width and dynamic speed by when on certain hosts. commit 38e4ced80479 ("drm/amd/pm: conditionally disable pcie lane switching for some sienna_cichlid SKUs") worked around this issue by setting up certain SKUs to set up certain limits, but the same fundamental problem with those hosts affects all SMU11 implmentations as well, so align the SMU11 and SMU13 driver handling. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-07-12	ring-buffer: Fix deadloop issue on reading trace_pipe	Zheng Yejian
	Soft lockup occurs when reading file 'trace_pipe': watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [cat:4488] [...] RIP: 0010:ring_buffer_empty_cpu+0xed/0x170 RSP: 0018:ffff88810dd6fc48 EFLAGS: 00000246 RAX: 0000000000000000 RBX: 0000000000000246 RCX: ffffffff93d1aaeb RDX: ffff88810a280040 RSI: 0000000000000008 RDI: ffff88811164b218 RBP: ffff88811164b218 R08: 0000000000000000 R09: ffff88815156600f R10: ffffed102a2acc01 R11: 0000000000000001 R12: 0000000051651901 R13: 0000000000000000 R14: ffff888115e49500 R15: 0000000000000000 [...] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8d853c2000 CR3: 000000010dcd8000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __find_next_entry+0x1a8/0x4b0 ? peek_next_entry+0x250/0x250 ? down_write+0xa5/0x120 ? down_write_killable+0x130/0x130 trace_find_next_entry_inc+0x3b/0x1d0 tracing_read_pipe+0x423/0xae0 ? tracing_splice_read_pipe+0xcb0/0xcb0 vfs_read+0x16b/0x490 ksys_read+0x105/0x210 ? __ia32_sys_pwrite64+0x200/0x200 ? switch_fpu_return+0x108/0x220 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x61/0xc6 Through the vmcore, I found it's because in tracing_read_pipe(), ring_buffer_empty_cpu() found some buffer is not empty but then it cannot read anything due to "rb_num_of_entries() == 0" always true, Then it infinitely loop the procedure due to user buffer not been filled, see following code path: tracing_read_pipe() { ... ... waitagain: tracing_wait_pipe() // 1. find non-empty buffer here trace_find_next_entry_inc() // 2. loop here try to find an entry __find_next_entry() ring_buffer_empty_cpu(); // 3. find non-empty buffer peek_next_entry() // 4. but peek always return NULL ring_buffer_peek() rb_buffer_peek() rb_get_reader_page() // 5. because rb_num_of_entries() == 0 always true here // then return NULL // 6. user buffer not been filled so goto 'waitgain' // and eventually leads to an deadloop in kernel!!! } By some analyzing, I found that when resetting ringbuffer, the 'entries' of its pages are not all cleared (see rb_reset_cpu()). Then when reducing the ringbuffer, and if some reduced pages exist dirty 'entries' data, they will be added into 'cpu_buffer->overrun' (see rb_remove_pages()), which cause wrong 'overrun' count and eventually cause the deadloop issue. To fix it, we need to clear every pages in rb_reset_cpu(). Link: https://lore.kernel.org/linux-trace-kernel/20230708225144.3785600-1-zhengyejian1@huawei.com Cc: stable@vger.kernel.org Fixes: a5fb833172eca ("ring-buffer: Fix uninitialized read_stamp") Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-12	drm/amd: Move helper for dynamic speed switch check out of smu13	Mario Limonciello
	This helper is used for checking if the connected host supports the feature, it can be moved into generic code to be used by other smu implementations as well. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-07-12	drm/amd/pm: conditionally disable pcie lane/speed switching for SMU13	Mario Limonciello
	Intel platforms such as Sapphire Rapids and Raptor Lake don't support dynamic pcie lane or speed switching. This limitation seems to carry over from one generation to another. To be safer, disable dynamic pcie lane width and speed switching when running on an Intel platform. Link: https://edc.intel.com/content/www/us/en/design/products/platforms/details/raptor-lake-s/13th-generation-core-processors-datasheet-volume-1-of-2/005/pci-express-support/ Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2663 Co-developed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-07-12	drm/amd/pm: share the code around SMU13 pcie parameters update	Evan Quan
	So that SMU13.0.0 and SMU13.0.7 do not need to have one copy each. Signed-off-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-07-12	drm/amdgpu: avoid restore process run into dead loop.	gaba
	In restore process worker, pinned BO cause update PTE fail, then the function re-schedule the restore_work. This will generate dead loop. Signed-off-by: gaba <gaba@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-07-12	drm/amd/pm: fix smu i2c data read risk	Yang Wang
	the smu driver_table is used for all types of smu tables data transcation (e.g: PPtable, Metrics, i2c, Ecc..). it is necessary to hold this lock to avoiding data tampering during the i2c read operation. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-07-12	tracing: arm64: Avoid missing-prototype warnings	Arnd Bergmann
	These are all tracing W=1 warnings in arm64 allmodconfig about missing prototypes: kernel/trace/trace_kprobe_selftest.c:7:5: error: no previous prototype for 'kprobe_trace_selftest_target' [-Werror=missing-pro totypes] kernel/trace/ftrace.c:329:5: error: no previous prototype for '__register_ftrace_function' [-Werror=missing-prototypes] kernel/trace/ftrace.c:372:5: error: no previous prototype for '__unregister_ftrace_function' [-Werror=missing-prototypes] kernel/trace/ftrace.c:4130:15: error: no previous prototype for 'arch_ftrace_match_adjust' [-Werror=missing-prototypes] kernel/trace/fgraph.c:243:15: error: no previous prototype for 'ftrace_return_to_handler' [-Werror=missing-prototypes] kernel/trace/fgraph.c:358:6: error: no previous prototype for 'ftrace_graph_sleep_time_control' [-Werror=missing-prototypes] arch/arm64/kernel/ftrace.c:460:6: error: no previous prototype for 'prepare_ftrace_return' [-Werror=missing-prototypes] arch/arm64/kernel/ptrace.c:2172:5: error: no previous prototype for 'syscall_trace_enter' [-Werror=missing-prototypes] arch/arm64/kernel/ptrace.c:2195:6: error: no previous prototype for 'syscall_trace_exit' [-Werror=missing-prototypes] Move the declarations to an appropriate header where they can be seen by the caller and callee, and make sure the headers are included where needed. Link: https://lore.kernel.org/linux-trace-kernel/20230517125215.930689-1-arnd@kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Florent Revest <revest@chromium.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> [ Fixed ftrace_return_to_handler() to handle CONFIG_HAVE_FUNCTION_GRAPH_RETVAL case ] Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-12	selftests/user_events: Test struct size match cases	Beau Belgrave
	The self tests for user_events currently does not ensure that the edge case for struct types work properly with size differences. Add cases for mis-matching struct names and sizes to ensure they work properly. Link: https://lkml.kernel.org/r/20230629235049.581-3-beaub@linux.microsoft.com Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: linux-kselftest@vger.kernel.org Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-12	nvme: fix parameter check in nvme_fault_inject_init()	Minjie Du
	Make IS_ERR() judge the debugfs_create_dir() function return. Signed-off-by: Minjie Du <duminjie@vivo.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-07-12	nvme: warn only once for legacy uuid attribute	Keith Busch
	Report the legacy fallback behavior for uuid attributes just once instead of logging repeated warnings for the same condition every time the attribute is read. The old behavior is too spamy on the kernel logs. Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reported-by: Breno Leitao <leitao@debian.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-07-12	drm/nouveau/disp/g94: enable HDMI	Karol Herbst
	Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Lyude Paul <lyude@redhat.com> Fixes: f530bc60a30b ("drm/nouveau/disp: move HDMI config into acquire + infoframe methods") Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230630160645.3984596-1-kherbst@redhat.com Signed-off-by: Karol Herbst <kherbst@redhat.com>
2023-07-12	drm/nouveau/disp: fix HDMI on gt215+	Karol Herbst
	Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Lyude Paul <lyude@redhat.com> Fixes: f530bc60a30b ("drm/nouveau/disp: move HDMI config into acquire + infoframe methods") Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230628212248.3798605-1-kherbst@redhat.com Signed-off-by: Karol Herbst <kherbst@redhat.com>
2023-07-12	drm/amd: Move helper for dynamic speed switch check out of smu13	Mario Limonciello
	This helper is used for checking if the connected host supports the feature, it can be moved into generic code to be used by other smu implementations as well. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/amdgpu: update kernel vcn ring test	Saleemkhan Jamadar
	add session context buffer to decoder ring test for vcn v1 to v3. v3 - correct the cmd for sesssion ctx buf v2 - add the buffer into IB (Leo liu) Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/amdgpu:update kernel vcn ring test	Saleemkhan Jamadar
	add session context buffer to decoder ring test. v5 - clear the session ct buffer (Christian) v4 - data type, explain change of ib size change (Christian) v3 - indent and v2 changes correction. (Christian) v2 - put the buffer at the end of the IB (Christian) Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/amd/display: only accept async flips for fast updates	Simon Ser
	Up until now, amdgpu was silently degrading to vsync when user-space requested an async flip but the hardware didn't support it. The hardware doesn't support immediate flips when the update changes the FB pitch, the DCC state, the rotation, enables or disables CRTCs or planes, etc. This is reflected in the dm_crtc_state.update_type field: UPDATE_TYPE_FAST means that immediate flip is supported. Silently degrading async flips to vsync is not the expected behavior from a uAPI point-of-view. Xorg expects async flips to fail if unsupported, to be able to fall back to a blit. i915 already behaves this way. This patch aligns amdgpu with uAPI expectations and returns a failure when an async flip is not possible. Signed-off-by: Simon Ser <contact@emersion.fr> Reviewed-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/amd/pm: conditionally disable pcie lane/speed switching for SMU13	Mario Limonciello
	Intel platforms such as Sapphire Rapids and Raptor Lake don't support dynamic pcie lane or speed switching. This limitation seems to carry over from one generation to another. To be safer, disable dynamic pcie lane width and speed switching when running on an Intel platform. Link: https://edc.intel.com/content/www/us/en/design/products/platforms/details/raptor-lake-s/13th-generation-core-processors-datasheet-volume-1-of-2/005/pci-express-support/ Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2663 Co-developed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/amdgpu: add watchdog timer enablement for gfx_v9_4_3	Tao Zhou
	Configure SQ watchdog timer setting. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/amdkfd: Update CWSR grace period for GFX9.4.3	Mukul Joshi
	For GFX9.4.3, setup a reduced default CWSR grace period equal to 1000 cycles instead of 64000 cycles. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/radeon: ERROR: "(foo)" should be "(foo )"	Ran Sun
	Fix five occurrences of the checkpatch.pl error: ERROR: "(foo)" should be "(foo )" Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/radeon: ERROR: that open brace { should be on the previous line	Ran Sun
	Fix eleven occurrences of the checkpatch.pl error: ERROR: that open brace { should be on the previous line Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/radeon: ERROR: "(foo)" should be "(foo )"	Ran Sun
	Fix four occurrences of the checkpatch.pl error: ERROR: "(foo)" should be "(foo )" Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/radeon: ERROR: "(foo)" should be "(foo )"	Ran Sun
	Fix four occurrences of the checkpatch.pl error: ERROR: "(foo)" should be "(foo )" Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/radeon: ERROR: "foo * bar" should be "foo *bar"	Ran Sun
	Fix nine occurrences of the checkpatch.pl error: ERROR: "foo * bar" should be "foo *bar" Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-12	drm/amdgpu: use psp_execute_load_ip_fw instead	Lang Yu
	Replace the old ones with psp_execute_load_ip_fw. Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>