linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2022-11-07	net: stmmac: dwmac-meson8b: fix meson8b_devm_clk_prepare_enable()	Rasmus Villemoes
	There are two problems with meson8b_devm_clk_prepare_enable(), introduced in commit a54dc4a49045 ("net: stmmac: dwmac-meson8b: Make the clock enabling code re-usable"): - It doesn't pass the clk argument, but instead always the rgmii_tx_clk of the device. - It silently ignores the return value of devm_add_action_or_reset(). The former didn't become an actual bug until another user showed up in the next commit 9308c47640d5 ("net: stmmac: dwmac-meson8b: add support for the RX delay configuration"). The latter means the callers could end up with the clock not actually prepared/enabled. Fixes: a54dc4a49045 ("net: stmmac: dwmac-meson8b: Make the clock enabling code re-usable") Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Link: https://lore.kernel.org/r/20221104083004.2212520-1-linux@rasmusvillemoes.dk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-07	samples/bpf: Fix sockex3 error: Missing BPF prog type	Rong Tao
	since commit 450b167fb9be("libbpf: clean up SEC() handling"), sec_def_matches() does not recognize "socket/xxx" as "socket", therefore, the BPF program type is not recognized. Instead of sockex3_user.c parsing section names to get the BPF program fd. We use the program array map to assign a static index to each BPF program (get inspired by selftests/bpf progs/test_prog_array_init.c). Therefore, use SEC("socket") as section name instead of SEC("socket/xxx"), so that the BPF program is parsed to SOCKET_FILTER type. The "missing BPF prog type" problem is solved. How to reproduce this error: $ cd samples/bpf $ sudo ./sockex3 libbpf: prog 'bpf_func_PARSE_IP': missing BPF prog type, check ELF section name 'socket/3' libbpf: prog 'bpf_func_PARSE_IP': failed to load: -22 libbpf: failed to load object './sockex3_kern.o' ERROR: loading BPF object file failed Signed-off-by: Rong Tao <rongtao@cestc.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/tencent_EBA3C18864069E42175946973C2ACBAF5408@qq.com
2022-11-07	selftests/bpf: Fix u32 variable compared with less than zero	Kang Minchul
	Variable ret is compared with less than zero even though it was set as u32. So u32 to int conversion is needed. Signed-off-by: Kang Minchul <tegongkang@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20221105183656.86077-1-tegongkang@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-11-07	bpf: Add explicit cast to 'void *' for __BPF_DISPATCHER_UPDATE()	Nathan Chancellor
	When building with clang: kernel/bpf/dispatcher.c:126:33: error: pointer type mismatch ('void ' and 'unsigned int ()(const void , const struct bpf_insn , bpf_func_t)' (aka 'unsigned int ()(const void , const struct bpf_insn , unsigned int ()(const void , const struct bpf_insn ))')) [-Werror,-Wpointer-type-mismatch] __BPF_DISPATCHER_UPDATE(d, new ?: &bpf_dispatcher_nop_func); ~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~ ./include/linux/bpf.h:1045:54: note: expanded from macro '__BPF_DISPATCHER_UPDATE' __static_call_update((_d)->sc_key, (_d)->sc_tramp, (_new)) ^~~~ 1 error generated. The warning is pointing out that the type of new ('void ') and &bpf_dispatcher_nop_func are not compatible, which could have side effects coming out of a conditional operator due to promotion rules. Add the explicit cast to 'void ' to make it clear that this is expected, as __BPF_DISPATCHER_UPDATE() expands to a call to __static_call_update(), which expects a 'void *' as its final argument. Fixes: c86df29d11df ("bpf: Convert BPF_DISPATCHER to use static_call() (not ftrace)") Link: https://github.com/ClangBuiltLinux/linux/issues/1755 Reported-by: kernel test robot <lkp@intel.com> Reported-by: "kernelci.org bot" <bot@kernelci.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20221107170711.42409-1-nathan@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-11-07	fs/userfaultfd: Fix maple tree iterator in userfaultfd_unregister()	Liam Howlett
	When iterating the VMAs, the maple state needs to be invalidated if the tree is modified by a split or merge to ensure the maple tree node contained in the maple state is still valid. These invalidations were missed, so add them to the paths which alter the tree. Reported-by: syzbot+0d2014e4da2ccced5b41@syzkaller.appspotmail.com Fixes: 69dbe6daf104 (userfaultfd: use maple tree iterator to iterate VMAs) Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-11-07	scsi: ibmvfc: Avoid path failures during live migration	Brian King
	Fix an issue reported when performing a live migration when multipath is configured with a short fast fail timeout of 5 seconds and also to have no_path_retry set to fail. In this scenario, all paths would go into the devloss state while the ibmvfc driver went through discovery to log back in. On a loaded system, the discovery might take longer than 5 seconds, which was resulting in all paths being marked failed, which then resulted in a read only filesystem. This patch changes the migration code in ibmvfc to avoid deleting rports at all in this scenario, so we avoid losing all paths. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20221026181356.148517-1-brking@linux.vnet.ibm.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2022-11-07	Merge tag 'platform-drivers-x86-v6.1-3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Hans de Goede: "The most important fixes here are a set of fixes for the ACPI backlight detection refactor which landed in 6.1. These fix regressions reported on some laptop models by making acpi_video_backlight_use_native() always return true for now, which in essence undoes some of the changes. I plan to take another shot at having only 1 /sys/class/backlight class device per panel with 6.2, with modified detection heuristics to avoid the (known) regressions. Highlights: - ACPI: video: Fix regressions from 6.1 backlight refactor by making acpi_video_backlight_use_native() always return true for now - Misc other bugfixes and HW id additions" * tag 'platform-drivers-x86-v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: p2sb: Don't fail if unknown CPU is found platform/x86/intel/hid: Add some ACPI device IDs platform/x86/intel/pmt: Sapphire Rapids PMT errata fix platform/x86: hp_wmi: Fix rfkill causing soft blocked wifi platform/x86: touchscreen_dmi: Add info for the RCA Cambio W101 v2 2-in-1 platform/x86: ideapad-laptop: Disable touchpad_switch ACPI: video: Add backlight=native DMI quirk for Dell G15 5515 ACPI: video: Make acpi_video_backlight_use_native() always return true ACPI: video: Improve Chromebook checks
2022-11-07	mm, slab: remove duplicate kernel-doc comment for ksize()	Vlastimil Babka
	Akira reports: > "make htmldocs" reports duplicate C declaration of ksize() as follows: > /linux/Documentation/core-api/mm-api:43: ./mm/slab_common.c:1428: WARNING: Duplicate C declaration, also defined at core-api/mm-api:212. > Declaration is '.. c:function:: size_t ksize (const void *objp)'. > This is due to the kernel-doc comment for ksize() declaration added in > include/linux/slab.h by commit 05a940656e1e ("slab: Introduce > kmalloc_size_roundup()"). There is an older kernel-doc comment for ksize() definition in mm/slab_common.c, which is not only duplicated, but also contradicts the new one - the additional storage discovered by ksize() should not be used by callers anymore. Delete the old kernel-doc. Reported-by: Akira Yokosawa <akiyks@gmail.com> Link: https://lore.kernel.org/all/d33440f6-40cf-9747-3340-e54ffaf7afb8@gmail.com/ Fixes: 05a940656e1e ("slab: Introduce kmalloc_size_roundup()") Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2022-11-07	mtd: onenand: omap2: add dependency on GPMC	Krzysztof Kozlowski
	OMAP2 OneNAND driver uses gpmc_omap_onenand_set_timings() provided by OMAP_GPMC driver, so the latter cannot be module if OneNAND driver is built-in: /usr/bin/arm-linux-gnueabi-ld: drivers/mtd/nand/onenand/onenand_omap2.o: in function `omap2_onenand_probe': onenand_omap2.c:(.text+0x520): undefined reference to `gpmc_omap_onenand_set_timings' The OMAP_GPMC is also a runtime dependency. Reported-by: kernel test robot <lkp@intel.com> Fixes: 854fd9209b20 ("memory: omap-gpmc: Allow building as a module") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20221107091520.127053-1-krzysztof.kozlowski@linaro.org
2022-11-07	mtd: rawnand: placate "$VARIABLE is used uninitialized" warnings	Adam Borowski
	The compiler is not smart enough to notice that it's impossible for them to be actually used uninitialized. Which exact variables trip here varies depending on random surrounding code; none triggered in 6.1-rc1 but 6.1-rc2 fails on three of these five, despite variables declared in the very same line having identical flow. Signed-off-by: Adam Borowski <kilobyte@angband.pl> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20221024092026.42123-1-kilobyte@angband.pl
2022-11-07	mtd: rawnand: qcom: handle ret from parse with codeword_fixup	Christian Marangi
	With use_codeword_fixup enabled, any return from mtd_device_parse_register gets overwritten. Aside from the clear bug, this is also problematic as a parser can EPROBE_DEFER and because this is not correctly handled, the nand is never rescanned later in the bootup process. An example of this problem is when smem requires additional time to be probed and nandc use qcomsmempart as parser. Parser will return EPROBE_DEFER but in the current code this ret gets overwritten by qcom_nand_host_parse_boot_partitions and qcom_nand_host_init_and_register return 0. Correctly handle the return code from mtd_device_parse_register so that any error from this function is not ignored. Fixes: 862bdedd7f4b ("mtd: nand: raw: qcom_nandc: add support for unprotected spare data pages") Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20221021165304.19991-1-ansuelsmth@gmail.com
2022-11-07	drm/panfrost: Remove type name from internal struct again	Steven Price
	Commit 72655fb942c1 ("drm/panfrost: replace endian-specific types with native ones") accidentally reverted part of the parent commit 7228d9d79248 ("drm/panfrost: Remove type name from internal structs") leading to the situation that the Panfrost UAPI header still doesn't compile correctly in C++. Revert the accidental revert and pass me a brown paper bag. Reported-by: Alyssa Rosenzweig <alyssa@collabora.com> Fixes: 72655fb942c1 ("drm/panfrost: replace endian-specific types with native ones") Signed-off-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221103114036.1581854-1-steven.price@arm.com
2022-11-07	pinctrl: rockchip: list all pins in a possible mux route for PX30	Quentin Schulz
	The mux routes are incomplete for the PX30. This was discovered because we had a HW design using cif-clkoutm1 with the correct pinmux in the Device Tree but the clock would still not work. There are actually two muxing required: the pin muxing (performed by the usual Device Tree pinctrl nodes) and the "function" muxing (m0 vs m1; performed by the mux routing inside the driver). The pin muxing was correct but the function muxing was not. This adds the missing pins and their configuration for the mux routes that are already specified in the driver. Note that there are some "conflicts": it is possible in Device Tree to (attempt to) mux the pins for e.g. clkoutm1 and clkinm0 at the same time but this is actually not possible in hardware (because both share the same bit for the function muxing). Since it is an impossible hardware design, it is not deemed necessary to prevent the user from attempting to "misconfigure" the pins/functions. Fixes: 87065ca9b8e5 ("pinctrl: rockchip: Add pinctrl support for PX30") Signed-off-by: Quentin Schulz <quentin.schulz@theobroma-systems.com> Link: https://lore.kernel.org/r/20221017-upstream-px30-cif-clkoutm1-v1-0-4ea1389237f7@theobroma-systems.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-11-07	btrfs: zoned: fix locking imbalance on scrub	Johannes Thumshirn
	If we're doing device replace on a zoned filesystem and discover in scrub_enumerate_chunks() that we don't have to copy the block group it is unlocked before it gets skipped. But as the block group hasn't yet been locked before it leads to a locking imbalance. To fix this simply remove the unlock. This was uncovered by fstests' testcase btrfs/163. Fixes: 9283b9e09a6d ("btrfs: remove lock protection for BLOCK_GROUP_FLAG_TO_COPY") Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-07	btrfs: zoned: initialize device's zone info for seeding	Johannes Thumshirn
	When performing seeding on a zoned filesystem it is necessary to initialize each zoned device's btrfs_zoned_device_info structure, otherwise mounting the filesystem will cause a NULL pointer dereference. This was uncovered by fstests' testcase btrfs/163. CC: stable@vger.kernel.org # 5.15+ Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-07	btrfs: zoned: clone zoned device info when cloning a device	Johannes Thumshirn
	When cloning a btrfs_device, we're not cloning the associated btrfs_zoned_device_info structure of the device in case of a zoned filesystem. Later on this leads to a NULL pointer dereference when accessing the device's zone_info for instance when setting a zone as active. This was uncovered by fstests' testcase btrfs/161. CC: stable@vger.kernel.org # 5.15+ Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-07	Revert "btrfs: scrub: use larger block size for data extent scrub"	Qu Wenruo
	This reverts commit 786672e9e1a39a231806313e3c445c236588ceef. [BUG] Since commit 786672e9e1a3 ("btrfs: scrub: use larger block size for data extent scrub"), btrfs scrub no longer reports errors if the corruption is not in the first sector of a STRIPE_LEN. The following script can expose the problem: mkfs.btrfs -f $dev mount $dev $mnt xfs_io -f -c "pwrite -S 0xff 0 8k" $mnt/foobar umount $mnt # 13631488 is the logical bytenr of above 8K extent btrfs-map-logical -l 13631488 -b 4096 $dev mirror 1 logical 13631488 physical 13631488 device /dev/test/scratch1 # Corrupt the 2nd sector of that extent xfs_io -f -c "pwrite -S 0x00 13635584 4k" $dev mount $dev $mnt btrfs scrub start -B $mnt scrub done for 54e63f9f-0c30-4c84-a33b-5c56014629b7 Scrub started: Mon Nov 7 07:18:27 2022 Status: finished Duration: 0:00:00 Total to scrub: 536.00MiB Rate: 0.00B/s Error summary: no errors found <<< [CAUSE] That offending commit enlarges the data extent scrub size from sector size to BTRFS_STRIPE_LEN, to avoid extra scrub_block to be allocated. But unfortunately the data extent scrub is still heavily relying on the fact that there is only one scrub_sector per scrub_block. Thus it will only check the first sector, and ignoring the remaining sectors. Furthermore the error reporting is not able to handle multiple sectors either. [FIX] For now just revert the offending commit. The consequence is just extra memory usage during scrub. We will need a proper change to make the remaining data scrub path to handle multiple sectors before we enlarging the data scrub size. Reported-by: Li Zhang <zhanglikernel@gmail.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-07	btrfs: don't print stack trace when transaction is aborted due to ENOMEM	David Sterba
	Add ENOMEM among the error codes that don't print stack trace on transaction abort. We've got several reports from syzbot that detects stacks as errors but caused by limiting memory. As this is an artificial condition we don't need to know where exactly the error happens, the abort and error cleanup will continue like e.g. for EIO. As the transaction aborts code needs to be inline in a lot of code, the implementation cases about minimal bloat. The error codes are in a separate function and the WARN uses the condition directly. This increases the code size by 571 bytes on release build. Alternatives considered: add -ENOMEM among the errors, this increases size by 2340 bytes, various attempts to combine the WARN and helper calls, increase by 700 or more bytes. Example syzbot reports (error -12): - https://syzkaller.appspot.com/bug?extid=5244d35be7f589cf093e - https://syzkaller.appspot.com/bug?extid=9c37714c07194d816417 Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-07	btrfs: selftests: fix wrong error check in btrfs_free_dummy_root()	Zhang Xiaoxu
	The btrfs_alloc_dummy_root() uses ERR_PTR as the error return value rather than NULL, if error happened, there will be a NULL pointer dereference: BUG: KASAN: null-ptr-deref in btrfs_free_dummy_root+0x21/0x50 [btrfs] Read of size 8 at addr 000000000000002c by task insmod/258926 CPU: 2 PID: 258926 Comm: insmod Tainted: G W 6.1.0-rc2+ #5 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 kasan_report+0xb7/0x140 kasan_check_range+0x145/0x1a0 btrfs_free_dummy_root+0x21/0x50 [btrfs] btrfs_test_free_space_cache+0x1a8c/0x1add [btrfs] btrfs_run_sanity_tests+0x65/0x80 [btrfs] init_btrfs_fs+0xec/0x154 [btrfs] do_one_initcall+0x87/0x2a0 do_init_module+0xdf/0x320 load_module+0x3006/0x3390 __do_sys_finit_module+0x113/0x1b0 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 Fixes: aaedb55bc08f ("Btrfs: add tests for btrfs_get_extent") CC: stable@vger.kernel.org # 4.9+ Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-07	btrfs: fix match incorrectly in dev_args_match_device	Liu Shixin
	syzkaller found a failed assertion: assertion failed: (args->devid != (u64)-1) \|\| args->missing, in fs/btrfs/volumes.c:6921 This can be triggered when we set devid to (u64)-1 by ioctl. In this case, the match of devid will be skipped and the match of device may succeed incorrectly. Patch 562d7b1512f7 introduced this function which is used to match device. This function contains two matching scenarios, we can distinguish them by checking the value of args->missing rather than check whether args->devid and args->uuid is default value. Reported-by: syzbot+031687116258450f9853@syzkaller.appspotmail.com Fixes: 562d7b1512f7 ("btrfs: handle device lookup with btrfs_dev_lookup_args") CC: stable@vger.kernel.org # 5.16+ Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Liu Shixin <liushixin2@huawei.com> Signed-off-by: David Sterba <dsterba@suse.com>
2022-11-07	drm/i915/userptr: restore probe_range behaviour	Matthew Auld
	The conversion looks harmless, however the addr value is updated inside the loop with the previous vm_end, which then incorrectly leads to for_each_vma_range() iterating over stuff outside the range we care about. Fix this by storing the end value separately. Also fix the case where the range doesn't intersect with any vma, or if the vma itself doesn't extend the entire range, which must mean we have hole at the end. Both should result in an error, as per the previous behaviour. v2: Fix the cases where the range is empty, or if there's a hole at the end of the range Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7247 Testcase: igt@gem_userptr_blits@probe Fixes: f683b9d61319 ("i915: use the VMA iterator") Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yu Zhao <yuzhao@google.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221028130635.465839-1-matthew.auld@intel.com (cherry picked from commit 6f7de35b50860c345babf8ed0aa0d75f9315eee4) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-11-07	drm/i915: Do not set cache_dirty for DGFX	Niranjana Vishwanathapura
	Currently on DG1, which does not have LLC, we hit the below warning while rebinding an userptr invalidated object. WARNING: CPU: 4 PID: 13008 at drivers/gpu/drm/i915/gem/i915_gem_pages.c:34 __i915_gem_object_set_pages+0x296/0x2d0 [i915] ... RIP: 0010:__i915_gem_object_set_pages+0x296/0x2d0 [i915] ... Call Trace: <TASK> i915_gem_userptr_get_pages+0x175/0x1a0 [i915] ____i915_gem_object_get_pages+0x32/0xb0 [i915] i915_gem_object_userptr_submit_init+0x286/0x470 [i915] eb_lookup_vmas+0x2ff/0xcf0 [i915] ? __intel_wakeref_get_first+0x55/0xb0 [i915] i915_gem_do_execbuffer+0x785/0x21d0 [i915] i915_gem_execbuffer2_ioctl+0xe7/0x3d0 [i915] We shouldn't be setting the obj->cache_dirty for DGFX, fix it. Fixes: d70af57944a1 ("drm/i915/shmem: ensure flush during swap-in on non-LLC") Suggested-by: Matthew Auld <matthew.auld@intel.com> Reported-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221102051416.27327-1-niranjana.vishwanathapura@intel.com (cherry picked from commit 0aeec60c76ca2631696b4228f3fc99fe3a80013d) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-11-07	drm/i915/psr: Send update also on invalidate	Jouni Högander
	Currently we are observing mouse cursor stuttering when using xrandr --scaling=1.2x1.2. X scaling/transformation seems to be doing fronbuffer rendering. When moving mouse cursor X seems to perform several invalidates and only one DirtyFB. I.e. it seems to be assuming updates are sent to panel while drawing is done. Earlier we were disabling PSR in frontbuffer invalidate call back (when drawing in X started). PSR was re-enabled in frontbuffer flush callback (dirtyfb ioctl). This was working fine with X scaling/transformation. Now we are just enabling continuous full frame (cff) in PSR invalidate callback. Enabling cff doesn't trigger any updates. It just configures PSR to send full frame when updates are sent. I.e. there are no updates on screen before PSR flush callback is made. X seems to be doing several updates in frontbuffer before doing dirtyfb ioctl. Fix this by sending single update on every invalidate callback. Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Mika Kahola <mika.kahola@intel.com> Fixes: 805f04d42a6b ("drm/i915/display/psr: Use continuos full frame to handle frontbuffer invalidations") Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6679 Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reported-by: Brian J. Tarricone <brian@tarricone.org> Tested-by: Brian J. Tarricone <brian@tarricone.org> Reviewed-by: Mika Kahola <mika.kahola@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221024054649.31299-1-jouni.hogander@intel.com (cherry picked from commit d755f89220a2b49bc90b7b520bb6edeb4adb5f01) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-11-07	drm/i915/dmabuf: fix sg_table handling in map_dma_buf	Matthew Auld
	We need to iterate over the original entries here for the sg_table, pulling out the struct page for each one, to be remapped. However currently this incorrectly iterates over the final dma mapped entries, which is likely just one gigantic sg entry if the iommu is enabled, leading to us only mapping the first struct page (and any physically contiguous pages following it), even if there is potentially lots more data to follow. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/7306 Fixes: 1286ff739773 ("i915: add dmabuf/prime buffer sharing support.") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Michael J. Ruhl <michael.j.ruhl@intel.com> Cc: <stable@vger.kernel.org> # v3.5+ Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221028155029.494736-1-matthew.auld@intel.com (cherry picked from commit 28d52f99bbca7227008cf580c9194c9b3516968e) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-11-07	can: rcar_canfd: Add missing ECC error checks for channels 2-7	Geert Uytterhoeven
	When introducing support for R-Car V3U, which has 8 instead of 2 channels, the ECC error bitmask was extended to take into account the extra channels, but rcar_canfd_global_error() was not updated to act upon the extra bits. Replace the RCANFD_GERFL_EEF[01] macros by a new macro that takes the channel number, fixing R-Car V3U while simplifying the code. Fixes: 45721c406dcf50d4 ("can: rcar_canfd: Add support for r8a779a0 SoC") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com> Link: https://lore.kernel.org/all/4edb2ea46cc64d0532a08a924179827481e14b4f.1666951503.git.geert+renesas@glider.be Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-11-07	can: dev: fix skb drop check	Oliver Hartkopp
	In commit a6d190f8c767 ("can: skb: drop tx skb if in listen only mode") the priv->ctrlmode element is read even on virtual CAN interfaces that do not create the struct can_priv at startup. This out-of-bounds read may lead to CAN frame drops for virtual CAN interfaces like vcan and vxcan. This patch mainly reverts the original commit and adds a new helper for CAN interface drivers that provide the required information in struct can_priv. Fixes: a6d190f8c767 ("can: skb: drop tx skb if in listen only mode") Reported-by: Dariusz Stojaczyk <Dariusz.Stojaczyk@opensynergy.com> Cc: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Cc: Max Staudt <max@enpas.org> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/all/20221102095431.36831-1-socketcan@hartkopp.net Cc: stable@vger.kernel.org # 6.0.x [mkl: patch pch_can, too] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-11-07	can: j1939: j1939_send_one(): fix missing CAN header initialization	Oliver Hartkopp
	The read access to struct canxl_frame::len inside of a j1939 created skbuff revealed a missing initialization of reserved and later filled elements in struct can_frame. This patch initializes the 8 byte CAN header with zero. Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Cc: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/linux-can/20221104052235.GA6474@pengutronix.de Reported-by: syzbot+d168ec0caca4697e03b1@syzkaller.appspotmail.com Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20221104075000.105414-1-socketcan@hartkopp.net Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-11-07	can: isotp: fix tx state handling for echo tx processing	Oliver Hartkopp
	In commit 4b7fe92c0690 ("can: isotp: add local echo tx processing for consecutive frames") the data flow for consecutive frames (CF) has been reworked to improve the reliability of long data transfers. This rework did not touch the transmission and the tx state changes of single frame (SF) transfers which likely led to the WARN in the isotp_tx_timer_handler() catching a wrong tx state. This patch makes use of the improved frame processing for SF frames and sets the ISOTP_SENDING state in isotp_sendmsg() within the cmpxchg() condition handling. A review of the state machine and the timer handling additionally revealed a missing echo timeout handling in the case of the burst mode in isotp_rcv_echo() and removes a potential timer configuration uncertainty in isotp_rcv_fc() when the receiver requests consecutive frames. Fixes: 4b7fe92c0690 ("can: isotp: add local echo tx processing for consecutive frames") Link: https://lore.kernel.org/linux-can/CAO4mrfe3dG7cMP1V5FLUkw7s+50c9vichigUMQwsxX4M=45QEw@mail.gmail.com/T/#u Reported-by: Wei Chen <harperchen1110@gmail.com> Cc: stable@vger.kernel.org # v6.0 Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20221104142551.16924-1-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-11-07	can: af_can: fix NULL pointer dereference in can_rx_register()	Zhengchao Shao
	It causes NULL pointer dereference when testing as following: (a) use syscall(__NR_socket, 0x10ul, 3ul, 0) to create netlink socket. (b) use syscall(__NR_sendmsg, ...) to create bond link device and vxcan link device, and bind vxcan device to bond device (can also use ifenslave command to bind vxcan device to bond device). (c) use syscall(__NR_socket, 0x1dul, 3ul, 1) to create CAN socket. (d) use syscall(__NR_bind, ...) to bind the bond device to CAN socket. The bond device invokes the can-raw protocol registration interface to receive CAN packets. However, ml_priv is not allocated to the dev, dev_rcv_lists is assigned to NULL in can_rx_register(). In this case, it will occur the NULL pointer dereference issue. The following is the stack information: BUG: kernel NULL pointer dereference, address: 0000000000000008 PGD 122a4067 P4D 122a4067 PUD 1223c067 PMD 0 Oops: 0000 [#1] PREEMPT SMP RIP: 0010:can_rx_register+0x12d/0x1e0 Call Trace: <TASK> raw_enable_filters+0x8d/0x120 raw_enable_allfilters+0x3b/0x130 raw_bind+0x118/0x4f0 __sys_bind+0x163/0x1a0 __x64_sys_bind+0x1e/0x30 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd </TASK> Fixes: 4e096a18867a ("net: introduce CAN specific pointer in the struct net_device") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Marc Kleine-Budde <mkl@pengutronix.de> Link: https://lore.kernel.org/all/20221028085650.170470-1-shaozhengchao@huawei.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-11-07	can: af_can: can_exit(): add missing dev_remove_pack() of canxl_packet	Chen Zhongjin
	In can_init(), dev_add_pack(&canxl_packet) is added but not removed in can_exit(). It breaks the packet handler list and can make kernel panic when can_init() is called for the second time. \| > modprobe can && rmmod can \| > rmmod xxx && modprobe can \| \| BUG: unable to handle page fault for address: fffffbfff807d7f4 \| RIP: 0010:dev_add_pack+0x133/0x1f0 \| Call Trace: \| <TASK> \| can_init+0xaa/0x1000 [can] \| do_one_initcall+0xd3/0x4e0 \| ... Fixes: fb08cba12b52 ("can: canxl: update CAN infrastructure for CAN XL frames") Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20221031033053.37849-1-chenzhongjin@huawei.com [mkl: adjust subject and commit message] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-11-07	mmc: sdhci_am654: Fix SDHCI_RESET_ALL for CQHCI	Brian Norris
	[[ NOTE: this is completely untested by the author, but included solely because, as noted in commit df57d73276b8 ("mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers"), "other drivers using CQHCI might benefit from a similar change, if they also have CQHCI reset by SDHCI_RESET_ALL." We've now seen the same bug on at least MSM, Arasan, and Intel hardware. ]] SDHCI_RESET_ALL resets will reset the hardware CQE state, but we aren't tracking that properly in software. When out of sync, we may trigger various timeouts. It's not typical to perform resets while CQE is enabled, but this may occur in some suspend or error recovery scenarios. Include this fix by way of the new sdhci_and_cqhci_reset() helper. This patch depends on (and should not compile without) the patch entitled "mmc: cqhci: Provide helper for resetting both SDHCI and CQHCI". Fixes: f545702b74f9 ("mmc: sdhci_am654: Add Support for Command Queuing Engine to J721E") Signed-off-by: Brian Norris <briannorris@chromium.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221026124150.v4.6.I35ca9d6220ba48304438b992a76647ca8e5b126f@changeid Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-11-07	mmc: sdhci-tegra: Fix SDHCI_RESET_ALL for CQHCI	Brian Norris
	[[ NOTE: this is completely untested by the author, but included solely because, as noted in commit df57d73276b8 ("mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers"), "other drivers using CQHCI might benefit from a similar change, if they also have CQHCI reset by SDHCI_RESET_ALL." We've now seen the same bug on at least MSM, Arasan, and Intel hardware. ]] SDHCI_RESET_ALL resets will reset the hardware CQE state, but we aren't tracking that properly in software. When out of sync, we may trigger various timeouts. It's not typical to perform resets while CQE is enabled, but this may occur in some suspend or error recovery scenarios. Include this fix by way of the new sdhci_and_cqhci_reset() helper. This patch depends on (and should not compile without) the patch entitled "mmc: cqhci: Provide helper for resetting both SDHCI and CQHCI". Fixes: 3c4019f97978 ("mmc: tegra: HW Command Queue Support for Tegra SDMMC") Signed-off-by: Brian Norris <briannorris@chromium.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221026124150.v4.5.I418c9eaaf754880fcd2698113e8c3ef821a944d7@changeid Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-11-07	mms: sdhci-esdhc-imx: Fix SDHCI_RESET_ALL for CQHCI	Brian Norris
	[[ NOTE: this is completely untested by the author, but included solely because, as noted in commit df57d73276b8 ("mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers"), "other drivers using CQHCI might benefit from a similar change, if they also have CQHCI reset by SDHCI_RESET_ALL." We've now seen the same bug on at least MSM, Arasan, and Intel hardware. ]] SDHCI_RESET_ALL resets will reset the hardware CQE state, but we aren't tracking that properly in software. When out of sync, we may trigger various timeouts. It's not typical to perform resets while CQE is enabled, but this may occur in some suspend or error recovery scenarios. Include this fix by way of the new sdhci_and_cqhci_reset() helper. This patch depends on (and should not compile without) the patch entitled "mmc: cqhci: Provide helper for resetting both SDHCI and CQHCI". Fixes: bb6e358169bf ("mmc: sdhci-esdhc-imx: add CMDQ support") Signed-off-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Haibo Chen <haibo.chen@nxp.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221026124150.v4.4.I7d01f9ad11bacdc9213dee61b7918982aea39115@changeid Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-11-07	mmc: sdhci-brcmstb: Fix SDHCI_RESET_ALL for CQHCI	Brian Norris
	[[ NOTE: this is completely untested by the author, but included solely because, as noted in commit df57d73276b8 ("mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers"), "other drivers using CQHCI might benefit from a similar change, if they also have CQHCI reset by SDHCI_RESET_ALL." We've now seen the same bug on at least MSM, Arasan, and Intel hardware. ]] SDHCI_RESET_ALL resets will reset the hardware CQE state, but we aren't tracking that properly in software. When out of sync, we may trigger various timeouts. It's not typical to perform resets while CQE is enabled, but this may occur in some suspend or error recovery scenarios. Include this fix by way of the new sdhci_and_cqhci_reset() helper. I only patch the bcm7216 variant even though others potentially could provide the 'supports-cqe' property (and thus enable CQHCI), because d46ba2d17f90 ("mmc: sdhci-brcmstb: Add support for Command Queuing (CQE)") and some Broadcom folks confirm that only the 7216 variant actually supports it. This patch depends on (and should not compile without) the patch entitled "mmc: cqhci: Provide helper for resetting both SDHCI and CQHCI". Fixes: d46ba2d17f90 ("mmc: sdhci-brcmstb: Add support for Command Queuing (CQE)") Signed-off-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221026124150.v4.3.I6a715feab6d01f760455865e968ecf0d85036018@changeid Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-11-07	Merge branch 'genetlink-per-op-type-policies'	David S. Miller
	Jakub Kicinski says: ==================== genetlink: support per op type policies While writing new genetlink families I was increasingly annoyed by the fact that we don't support different policies for do and dump callbacks. This makes it hard to do proper input validation for dumps which usually have a lot more narrow range of accepted attributes. There is also a minor inconvenience of not supporting different per_doit and post_doit callbacks per op. This series addresses those problems by introducing another op format. v3: - minor fixes to patch 12 after I took it for a spin with a real family - adjust commit msg in patch 8 v2: https://lore.kernel.org/all/20221102213338.194672-1-kuba@kernel.org/ - wait for net changes to propagate - restore the missing comment in patch 1 - drop extra space in patch 3 - improve commit message in patch 4 v1: https://lore.kernel.org/all/20221018230728.1039524-1-kuba@kernel.org/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-07	genetlink: convert control family to split ops	Jakub Kicinski
	Prove that the split ops work. Sadly we need to keep bug-wards compatibility and specify the same policy for dump as do, even tho we don't parse inputs for the dump. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-07	genetlink: allow families to use split ops directly	Jakub Kicinski
	Let families to hook in the new split ops. They are more flexible and should not be much larger than full ops. Each split op is 40B while full op is 48B. Devlink for example has 54 dos and 19 dumps, 2 of the dumps do not have a do -> 56 full commands = 2688B. Split ops would have taken 2920B, so 9% more space while allowing individual per/post doit and per-type policies. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-07	genetlink: inline old iteration helpers	Jakub Kicinski
	All dumpers use the iterators now, inline the cmd by index stuff into iterator code. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-07	genetlink: use iterator in the op to policy map dumping	Jakub Kicinski
	We can't put the full iterator in the struct ctrl_dump_policy_ctx because dump context is statically sized by netlink core. Allocate it dynamically. Rename policy to dump_map to make the logic a little easier to follow. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-07	genetlink: add iterator for walking family ops	Jakub Kicinski
	Subsequent changes will expose split op structures to users, so walking the family ops with just an index will get harder. Add a structured iterator, convert the simple cases. Policy dumping needs more careful conversion. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-07	genetlink: inline genl_get_cmd()	Jakub Kicinski
	All callers go via genl_get_cmd_split() now, so rename it to genl_get_cmd() remove the original. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-07	genetlink: support split policies in ctrl_dumppolicy_put_op()	Jakub Kicinski
	Pass do and dump versions of the op to ctrl_dumppolicy_put_op() so that it can provide a different policy index for the two. Since we now look at policies, and those are set appropriately there's no need to look at the GENL_DONT_VALIDATE_DUMP flag. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-07	genetlink: add policies for both doit and dumpit in ctrl_dumppolicy_start()	Jakub Kicinski
	Separate adding doit and dumpit policies for CTRL_CMD_GETPOLICY. This has no effect until we actually allow do and dump to come from different sources as netlink_policy_dump_add_policy() does deduplication. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-07	genetlink: check for callback type at op load time	Jakub Kicinski
	Now that genl_get_cmd_split() is informed what type of callback user is trying to access (do or dump) we can check that this callback is indeed available and return an error early. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-07	genetlink: load policy based on validation flags	Jakub Kicinski
	Set the policy and maxattr pointers based on validation flags. genl_family_rcv_msg_attrs_parse() will do nothing and return NULL if maxattrs is zero, so no behavior change is expected. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-07	genetlink: introduce split op representation	Jakub Kicinski
	We currently have two forms of operations - small ops and "full" ops (or just ops). The former does not have pointers for some of the less commonly used features (namely dump start/done and policy). The "full" ops, however, still don't contain all the necessary information. In particular the policy is per command ID, while do and dump often accept different attributes. It's also not possible to define different pre_doit and post_doit callbacks for different commands within the family. At the same time a lot of commands do not support dumping and therefore all the dump-related information is wasted space. Create a new command representation which can hold info about a do implementation or a dump implementation, but not both at the same time. Use this new representation on the command execution path (genl_family_rcv_msg) as we either run a do or a dump and don't have to create a "full" op there. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-07	genetlink: move the private fields in struct genl_family	Jakub Kicinski
	Move the private fields down to form a "private section". Use the kdoc "private:" label comment thing to hide them from the main kdoc comment. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-07	genetlink: refactor the cmd <> policy mapping dump	Jakub Kicinski
	The code at the top of ctrl_dumppolicy() dumps mappings between ops and policies. It supports dumping both the entire family and single op if dump is filtered. But both of those cases are handled inside a loop, which makes the logic harder to follow and change. Refactor to split the two cases more clearly. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-07	mmc: sdhci-of-arasan: Fix SDHCI_RESET_ALL for CQHCI	Brian Norris
	SDHCI_RESET_ALL resets will reset the hardware CQE state, but we aren't tracking that properly in software. When out of sync, we may trigger various timeouts. It's not typical to perform resets while CQE is enabled, but one particular case I hit commonly enough: mmc_suspend() -> mmc_power_off(). Typically we will eventually deactivate CQE (cqhci_suspend() -> cqhci_deactivate()), but that's not guaranteed -- in particular, if we perform a partial (e.g., interrupted) system suspend. The same bug was already found and fixed for two other drivers, in v5.7 and v5.9: 5cf583f1fb9c ("mmc: sdhci-msm: Deactivate CQE during SDHC reset") df57d73276b8 ("mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers") The latter is especially prescient, saying "other drivers using CQHCI might benefit from a similar change, if they also have CQHCI reset by SDHCI_RESET_ALL." So like these other patches, deactivate CQHCI when resetting the controller. Do this via the new sdhci_and_cqhci_reset() helper. This patch depends on (and should not compile without) the patch entitled "mmc: cqhci: Provide helper for resetting both SDHCI and CQHCI". Fixes: 84362d79f436 ("mmc: sdhci-of-arasan: Add CQHCI support for arasan,sdhci-5.1") Cc: <stable@vger.kernel.org> Signed-off-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20221026124150.v4.2.I29f6a2189e84e35ad89c1833793dca9e36c64297@changeid Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-11-07	mmc: cqhci: Provide helper for resetting both SDHCI and CQHCI	Brian Norris
	Several SDHCI drivers need to deactivate command queueing in their reset hook (see sdhci_cqhci_reset() / sdhci-pci-core.c, for example), and several more are coming. Those reset implementations have some small subtleties (e.g., ordering of initialization of SDHCI vs. CQHCI might leave us resetting with a NULL ->cqe_private), and are often identical across different host drivers. We also don't want to force a dependency between SDHCI and CQHCI, or vice versa; non-SDHCI drivers use CQHCI, and SDHCI drivers might support command queueing through some other means. So, implement a small helper, to avoid repeating the same mistakes in different drivers. Simply stick it in a header, because it's so small it doesn't deserve its own module right now, and inlining to each driver is pretty reasonable. This is marked for -stable, as it is an important prerequisite patch for several SDHCI controller bugfixes that follow. Cc: <stable@vger.kernel.org> Signed-off-by: Brian Norris <briannorris@chromium.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20221026124150.v4.1.Ie85faa09432bfe1b0890d8c24ff95e17f3097317@changeid Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>