Age | Commit message | Author
2021-11-16 | KVM: selftests: Fill per-vCPU struct during "perf_test" VM creation | Sean Christopherson
Fill the per-vCPU args when creating the perf_test VM instead of having the caller do so. This helps ensure that any adjustments to the number of pages (and thus vcpu_memory_bytes) are reflected in the per-VM args. Automatically filling the per-vCPU args will also allow a future patch to do the sync to the guest during creation. Signed-off-by: Sean Christopherson <seanjc@google.com> [Updated access_tracking_perf_test as well.] Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Message-Id: <20211111000310.1435032-12-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16 | KVM: selftests: Create VM with adjusted number of guest pages for perf tests | Sean Christopherson
Use the already computed guest_num_pages when creating the so called extra VM pages for a perf test, and add a comment explaining why the pages are allocated as extra pages. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-11-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16 | KVM: selftests: Remove perf_test_args.host_page_size | Sean Christopherson
Remove perf_test_args.host_page_size and instead use getpagesize() so that it's somewhat obvious that, for tests that care about the host page size, they care about the system page size, not the hardware page size, e.g. that the logic is unchanged if hugepages are in play. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-10-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16 | KVM: selftests: Move per-VM GPA into perf_test_args | Sean Christopherson
Move the per-VM GPA into perf_test_args instead of storing it as a separate global variable. It's not obvious that guest_test_phys_mem holds a GPA, nor that it's connected/coupled with per_vcpu->gpa. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-9-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16 | KVM: selftests: Use perf util's per-vCPU GPA/pages in demand paging test | Sean Christopherson
Grab the per-vCPU GPA and number of pages from perf_util in the demand paging test instead of duplicating perf_util's calculations. Note, this may or may not result in a functional change. It's not clear that the test's calculations are guaranteed to yield the same value as perf_util, e.g. if guest_percpu_mem_size != vcpu_args->pages. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-8-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16 | KVM: selftests: Capture per-vCPU GPA in perf_test_vcpu_args | Sean Christopherson
Capture the per-vCPU GPA in perf_test_vcpu_args so that tests can get the GPA without having to calculate the GPA on their own. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-7-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
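The idea of recording each vCPU's GPA once, so that tests do not recompute it, can be sketched as follows. This is a minimal illustration only: the `demo_*` structs, their fields, and the even split of the region are assumptions made for the example, not the selftests' actual definitions.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_VCPUS 4

/* Hypothetical stand-in for a per-vCPU argument struct. */
struct demo_vcpu_args {
    uint64_t gpa;   /* start of this vCPU's slice of guest memory */
    uint64_t pages; /* number of guest pages in the slice */
};

/* Hypothetical stand-in for the per-VM argument struct. */
struct demo_test_args {
    uint64_t gpa;             /* base GPA of the whole test region */
    uint64_t guest_page_size; /* size of a non-huge guest page */
    struct demo_vcpu_args vcpu[DEMO_MAX_VCPUS];
};

/* Split the region evenly and store each vCPU's GPA at setup time. */
static void demo_fill_vcpu_args(struct demo_test_args *pta, int nr_vcpus,
                                uint64_t vcpu_memory_bytes)
{
    assert(nr_vcpus > 0 && nr_vcpus <= DEMO_MAX_VCPUS);
    assert(vcpu_memory_bytes % pta->guest_page_size == 0);

    for (int i = 0; i < nr_vcpus; i++) {
        pta->vcpu[i].pages = vcpu_memory_bytes / pta->guest_page_size;
        pta->vcpu[i].gpa = pta->gpa + (uint64_t)i * vcpu_memory_bytes;
    }
}
```

A test would then read `pta->vcpu[i].gpa` directly instead of repeating the arithmetic.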
2021-11-16 | KVM: selftests: Use shorthand local var to access struct perf_tests_args | Sean Christopherson
Use 'pta' as a local pointer to the global perf_tests_args in order to shorten line lengths and make the code borderline readable. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-6-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16 | KVM: selftests: Require GPA to be aligned when backed by hugepages | Sean Christopherson
Assert that the GPA for a memslot backed by a hugepage is aligned to the hugepage size and fix perf_test_util accordingly. Lack of GPA alignment prevents KVM from backing the guest with hugepages, e.g. x86's write-protection of hugepages when dirty logging is activated is otherwise not exercised. Add a comment explaining that guest_page_size is for non-huge pages to try and avoid confusion about what it actually tracks. Cc: Ben Gardon <bgardon@google.com> Cc: Yanan Wang <wangyanan55@huawei.com> Cc: Andrew Jones <drjones@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Aaron Lewis <aaronlewis@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> [Used get_backing_src_pagesz() to determine alignment dynamically.] Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-5-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16 | KVM: selftests: Assert mmap HVA is aligned when using HugeTLB | Sean Christopherson
Manually padding and aligning the mmap region is only needed when using THP. When using HugeTLB, mmap will always return an address aligned to the HugeTLB page size. Add a comment to clarify this and assert the mmap behavior for HugeTLB. [Removed requirement that HugeTLB mmaps must be padded per Yanan's feedback and added assertion that mmap returns aligned addresses when using HugeTLB.] Cc: Ben Gardon <bgardon@google.com> Cc: Yanan Wang <wangyanan55@huawei.com> Cc: Andrew Jones <drjones@redhat.com> Cc: Peter Xu <peterx@redhat.com> Cc: Aaron Lewis <aaronlewis@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-4-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
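The manual padding and alignment mentioned for the THP case can be illustrated with a small user-space sketch. The helper name `demo_mmap_aligned()` and its interface are hypothetical; the selftests library does this inside its own memslot setup code, which is not reproduced here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Map 'size' bytes so that the returned address is aligned to 'alignment'
 * (a power of two) by over-allocating and rounding the start address up.
 * The caller keeps *map_start and *map_size to munmap() the region later.
 */
static void *demo_mmap_aligned(size_t size, size_t alignment,
                               void **map_start, size_t *map_size)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    *map_size = size + alignment; /* padding leaves room to round up */
    *map_start = mmap(NULL, *map_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (*map_start == MAP_FAILED)
        return NULL;

    uintptr_t addr = (uintptr_t)*map_start;
    return (void *)((addr + alignment - 1) & ~((uintptr_t)alignment - 1));
}
```

For a HugeTLB-backed mapping the padding would be unnecessary, since, as the message states, mmap already returns an address aligned to the HugeTLB page size.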
2021-11-16 | KVM: selftests: Expose align() helpers to tests | Sean Christopherson
Refactor align() to work with non-pointers and split into separate helpers for aligning up vs. down. Add align_ptr_up() for use with pointers. Expose all helpers so that they can be used by tests and/or other utilities. The align_down() helper in particular will be used to ensure gpa alignment for hugepages. No functional change intended. [Added separate up/down helpers and replaced open-coded alignment bit math throughout the KVM selftests.] Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Message-Id: <20211111000310.1435032-3-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
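As a rough illustration of what up/down alignment helpers of this kind compute, here is a self-contained sketch. The `demo_` names are placeholders, and the bodies are a common power-of-two formulation, not necessarily the code the patch adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round x up to the next multiple of size; size must be a power of two. */
static inline uint64_t demo_align_up(uint64_t x, uint64_t size)
{
    uint64_t mask = size - 1;

    assert(size != 0 && (size & mask) == 0);
    return (x + mask) & ~mask;
}

/* Round x down to the previous multiple of size (power of two). */
static inline uint64_t demo_align_down(uint64_t x, uint64_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return x & ~(size - 1);
}

/* Pointer flavour of demo_align_up(). */
static inline void *demo_align_ptr_up(void *p, size_t size)
{
    return (void *)(uintptr_t)demo_align_up((uintptr_t)p, size);
}
```

Aligning a GPA down to the hugepage size, as the message anticipates, would then be a call such as `demo_align_down(gpa, hugepage_size)`.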
2021-11-16 | KVM: selftests: Explicitly state indicies for vm_guest_mode_params array | Sean Christopherson
Explicitly state the indices when populating vm_guest_mode_params to make it marginally easier to visualize what's going on. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> [Added indices for new guest modes.] Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20211111000310.1435032-2-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
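Stating the indices explicitly relies on C designated initializers. The sketch below shows the pattern with a made-up enum and parameter struct; the `DEMO_*` names and the values are illustrative assumptions, not the contents of vm_guest_mode_params.

```c
#include <assert.h>

/* Hypothetical guest modes, used only as array indices here. */
enum demo_guest_mode {
    DEMO_MODE_P52V48_4K,
    DEMO_MODE_P52V48_64K,
    DEMO_MODE_P40V48_4K,
    DEMO_NUM_MODES,
};

struct demo_mode_params {
    unsigned int pa_bits;
    unsigned int va_bits;
    unsigned int page_size;
    unsigned int page_shift;
};

/* Each entry names its index, so reordering the enum cannot mismatch rows. */
static const struct demo_mode_params demo_mode_params[] = {
    [DEMO_MODE_P52V48_4K]  = { 52, 48, 0x1000, 12 },
    [DEMO_MODE_P52V48_64K] = { 52, 48, 0x10000, 16 },
    [DEMO_MODE_P40V48_4K]  = { 40, 48, 0x1000, 12 },
};

/* Catch a mode that was added to the enum but not to the table. */
_Static_assert(sizeof(demo_mode_params) / sizeof(demo_mode_params[0]) ==
               DEMO_NUM_MODES, "every mode needs a params entry");

static const struct demo_mode_params *demo_get_params(enum demo_guest_mode mode)
{
    assert((unsigned int)mode < DEMO_NUM_MODES);
    return &demo_mode_params[mode];
}
```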
2021-11-16 | KVM: selftests: Add event channel upcall support to xen_shinfo_test | David Woodhouse
When I first looked at this, there was no support for guest exception handling in the KVM selftests. In fact it was merged into 5.10 before the Xen support got merged in 5.11, and I could have used it from the start. Hook it up now, to exercise the Xen upcall delivery. I'm about to make things a bit more interesting by handling the full 2level event channel stuff in-kernel on top of the basic vector injection that we already have, and I'll want to build more tests on top. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Message-Id: <20211115165030.7422-3-dwmw2@infradead.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-16 | udp: Validate checksum in udp_read_sock() | Cong Wang
It turns out the skb's in sock receive queue could have bad checksums, as both ->poll() and ->recvmsg() validate checksums. We have to do the same for ->read_sock() path too before they are redirected in sockmap. Fixes: d7f571188ecf ("udp: Implement ->read_sock() for sockmap") Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20211115044006.26068-1-xiyou.wangcong@gmail.com
2021-11-16 | s390: wire up sys_futex_waitv system call | Vasily Gorbik
Tested with futex kselftests. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-16 | s390/vdso: filter out -mstack-guard and -mstack-size | Sven Schnelle
When CONFIG_VMAP_STACK is disabled, the user can enable CONFIG_STACK_CHECK, which adds a stack overflow check to each C function in the kernel. This is also done for functions in the vdso page. These functions are run in user context and user stack sizes are usually different to what the kernel uses. This might trigger the stack check although the stack size is valid. Therefore filter the -mstack-guard and -mstack-size flags when compiling vdso C files. Cc: stable@kernel.org # 5.10+ Fixes: 4bff8cb54502 ("s390: convert to GENERIC_VDSO") Reported-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-16 | s390/vdso: remove -nostdlib compiler flag | Masahiro Yamada
The -nostdlib option requests the compiler to not use the standard system startup files or libraries when linking. It is effective only when $(CC) is used as a linker driver. Since commit 2b2a25845d53 ("s390/vdso: Use $(LD) instead of $(CC) to link vDSO"), $(LD) is directly used, hence -nostdlib is unneeded. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20211107162111.323701-1-masahiroy@kernel.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-16 | s390: replace snprintf in show functions with sysfs_emit | Qing Wang
show() must not use snprintf() when formatting the value to be returned to user space. Fix the coccicheck warnings: WARNING: use scnprintf or sprintf. Using sysfs_emit instead of scnprintf or sprintf makes more sense. Signed-off-by: Qing Wang <wangqing@vivo.com> Acked-by: Vineeth Vijayan <vneethv@linux.ibm.com> Acked-by: Stefan Haberland <sth@linux.ibm.com> Link: https://lore.kernel.org/r/1634280655-4908-1-git-send-email-wangqing@vivo.com [hca@linux.ibm.com: fix indentation] Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-16 | s390/boot: simplify and fix kernel memory layout setup | Vasily Gorbik
Initial KASAN shadow memory range was picked to preserve original kernel modules area position. With protected execution support, which might impose addressing limitation on vmalloc area and hence affect modules area position, current fixed KASAN shadow memory range is only making kernel memory layout setup more complex. So move it to the very end of available virtual space and simplify calculations. At the same time return to previous kernel address space split. In particular commit 0c4f2623b957 ("s390: setup kernel memory layout early") introduced precise identity map size calculation and keeping vmemmap left most starting from a fresh region table entry. This didn't take into account additional mapping region requirement for potential DCSS mapping above available physical memory. So go back to virtual space split between 1:1 mapping & vmemmap array once vmalloc area size is subtracted. Cc: stable@vger.kernel.org Fixes: 0c4f2623b957 ("s390: setup kernel memory layout early") Reported-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-16 | s390/setup: re-arrange memblock setup | Vasily Gorbik
- Avoid using ULONG_MAX in memblock_remove, it has no functional change but makes memblock_dbg output a range which makes sense.
- Actually finish memblock memory setup before doing amode31/cr/uv setup.
- Move memblock_dump_all() debug output after memblock memory setup is complete. This gives us final "memory" regions if they were trimmed due to addressing limits and still "physmem" regions as original info which came from mem_detect.
Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-16 | s390/setup: avoid using memblock_enforce_memory_limit | Vasily Gorbik
There is a difference in how architectures treat "mem=" option. For some that is an amount of online memory, for s390 and x86 this is the limiting max address. Some memblock api like memblock_enforce_memory_limit() take limit argument and explicitly treat it as the size of online memory, and use __find_max_addr to convert it to an actual max address. Current s390 usage: memblock_enforce_memory_limit(memblock_end_of_DRAM()); yields different results depending on presence of memory holes (offline memory blocks in between online memory). If there are no memory holes limit == max_addr in memblock_enforce_memory_limit() and it does trim online memory and reserved memory regions. With memory holes present it actually does nothing. Since we already use memblock_remove() explicitly to trim online memory regions to potential limit (think mem=, kdump, addressing limits, etc.) drop the usage of memblock_enforce_memory_limit() altogether. Trimming reserved regions should not be required, since we now use memblock_set_current_limit() to limit allocations and any explicit memory reservations above the limit is an actual problem we should not hide. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
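The difference between a limit treated as "amount of online memory" and one treated as "maximum address" can be seen in a simplified user-space model. The `demo_*` functions below are assumptions written for illustration; they mimic the conversion described in the message but are not the kernel's memblock code.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_ADDR_MAX UINT64_MAX

/* One block of online memory; regions are sorted by base address. */
struct demo_region {
    uint64_t base;
    uint64_t size;
};

/*
 * Treat 'limit' as an amount of online memory and return the address below
 * which that much memory lives. Returns DEMO_ADDR_MAX when 'limit' exceeds
 * the total online memory, in which case a caller would trim nothing.
 */
static uint64_t demo_find_max_addr(const struct demo_region *r, int n,
                                   uint64_t limit)
{
    for (int i = 0; i < n; i++) {
        if (limit <= r[i].size)
            return r[i].base + limit;
        limit -= r[i].size;
    }
    return DEMO_ADDR_MAX;
}

/* End address of the last online region. */
static uint64_t demo_end_of_dram(const struct demo_region *r, int n)
{
    assert(n > 0);
    return r[n - 1].base + r[n - 1].size;
}
```

With one contiguous 4 GiB region, passing the end of DRAM as the limit maps back to the same address. With two 2 GiB regions separated by a 2 GiB hole, the end of DRAM is 6 GiB but only 4 GiB are online, so the model returns `DEMO_ADDR_MAX` and nothing would be trimmed, which matches the behaviour the message describes.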
2021-11-16 | s390/setup: avoid reserving memory above identity mapping | Vasily Gorbik
Such reserved memory region, if not cleaned up later causes problems when memblock_free_all() is called to release free pages to the buddy allocator and those reserved regions are carried over to reserve_bootmem_region() which marks the pages as PageReserved. Instead use memblock_set_current_limit() to make sure memblock allocations do not go over identity mapping (which could happen when "mem=" option is used or during kdump). Cc: stable@vger.kernel.org Fixes: 73045a08cf55 ("s390: unify identity mapping limits handling") Reported-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-11-16 | powerpc/8xx: Fix pinned TLBs with CONFIG_STRICT_KERNEL_RWX | Christophe Leroy
As spotted and explained in commit c12ab8dbc492 ("powerpc/8xx: Fix Oops with STRICT_KERNEL_RWX without DEBUG_RODATA_TEST"), the selection of STRICT_KERNEL_RWX without selecting DEBUG_RODATA_TEST has spotted the lack of the DIRTY bit in the pinned kernel data TLBs. This problem should have been detected a lot earlier if things had been working as expected. But due to an incredible level of chance or mishap, this went undetected because of a set of bugs: In fact the DTLBs were not pinned, because instead of setting the reserve bit in MD_CTR, it was set in MI_CTR that is the register for ITLBs. But then, another huge bug was there: the physical address was reset to 0 at the boundary between RO and RW areas, leading to the same physical space being mapped at both 0xc0000000 and 0xc8000000. This had by miracle no consequence until now because the entry was not really pinned so it was overwritten soon enough to go undetected. Of course, now that we really pin the DTLBs, it must be fixed as well. Fixes: f76c8f6d257c ("powerpc/8xx: Add function to set pinned TLBs") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Depends-on: c12ab8dbc492 ("powerpc/8xx: Fix Oops with STRICT_KERNEL_RWX without DEBUG_RODATA_TEST") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a21e9a057fe2d247a535aff0d157a54eefee017a.1636963688.git.christophe.leroy@csgroup.eu
2021-11-16 | powerpc/signal32: Fix sigset_t copy | Christophe Leroy
The conversion from __copy_from_user() to __get_user() by commit d3ccc9781560 ("powerpc/signal: Use __get_user() to copy sigset_t") introduced a regression in __get_user_sigset() for powerpc/32. The bug was subsequently moved into unsafe_get_user_sigset(). The bug is due to the copied 64 bit value being truncated to 32 bits while being assigned to dst->sig[0]. The regression was reported by users of the Xorg packages distributed in Debian/powerpc -- "The symptoms are that the fb screen goes blank, with the backlight remaining on and no errors logged in /var/log; wdm (or startx) run with no effect (I tried logging in in the blind, with no effect). And they are hard to kill, requiring 'kill -KILL ...'" Fix the regression by copying each word of the sigset, not only the first one. __get_user_sigset() was tentatively optimised to copy 64 bits at once in order to minimise KUAP unlock/lock impact, but the unsafe variant doesn't suffer that, so it can just copy words. Fixes: 887f3ceb51cd ("powerpc/signal32: Convert do_setcontext[_tm]() to user access block") Cc: stable@vger.kernel.org # v5.13+ Reported-by: Finn Thain <fthain@linux-m68k.org> Reported-and-tested-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/99ef38d61c0eb3f79c68942deb0c35995a93a777.1636966353.git.christophe.leroy@csgroup.eu
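The truncation can be reproduced in a small user-space model. The sketch assumes a signal set stored as two 32-bit words and composes the 64-bit value in a fixed order for simplicity; which half actually survives on real hardware depends on byte order. All `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NSIG_WORDS 2

/* Model of a 32-bit signal set: two 32-bit words. */
struct demo_sigset {
    uint32_t sig[DEMO_NSIG_WORDS];
};

/*
 * Buggy pattern: the whole set is read as one 64-bit value and assigned to a
 * single 32-bit word, so half of the value is dropped and sig[1] is never
 * written.
 */
static void demo_copy_truncating(struct demo_sigset *dst, const uint32_t *src)
{
    assert(dst && src);
    uint64_t v = ((uint64_t)src[1] << 32) | src[0];

    dst->sig[0] = (uint32_t)v; /* implicit narrowing made explicit */
}

/* Fixed pattern: copy every word of the set. */
static void demo_copy_words(struct demo_sigset *dst, const uint32_t *src)
{
    assert(dst && src);
    for (int i = 0; i < DEMO_NSIG_WORDS; i++)
        dst->sig[i] = src[i];
}
```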
2021-11-16 | powerpc/book3e: Fix TLBCAM preset at boot | Christophe Leroy
Commit 52bda69ae8b5 ("powerpc/fsl_booke: Tell map_mem_in_cams() if init is done") was supposed to just add an additional parameter to map_mem_in_cams() and always set it to 'true' at that time. But a few call sites were messed up. Fix them. Fixes: 52bda69ae8b5 ("powerpc/fsl_booke: Tell map_mem_in_cams() if init is done") Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d319f2a9367d4d08fd2154e506101bd5f100feeb.1636967119.git.christophe.leroy@csgroup.eu
2021-11-16 | platform/x86: thinkpad_acpi: fix documentation for adaptive keyboard | Vincent Bernat
The different values were offset by 1. 0 is for "home mode", 1 for "web-browser mode", etc. Moreover, the URL to the laptop's user guide did not work anymore. Signed-off-by: Vincent Bernat <vincent@bernat.ch> Link: https://lore.kernel.org/r/20211109195209.176905-1-vincent@bernat.ch Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-11-16 | platform/x86: thinkpad_acpi: Fix WWAN device disabled issue after S3 deep | Slark Xiao
When a WWAN device wakes from S3 deep on a ThinkPad platform, WWAN is disabled. The disabled status can be checked with the command 'nmcli r wwan' or 'rfkill list'. Issue analysis: when the host resumes from S3 deep, the thinkpad_acpi driver calls the hotkey_resume() function. Finally, it uses wan_get_status to check the current status of the WWAN device. During this resume process, wan_get_status always returns off, even when WWAN has booted up completely. In patch V2, Hans said 'sw_state should be unchanged after a suspend/resume. It's better to drop the tpacpi_rfk_update_swstate call all together from the resume path'. And it's confirmed by Lenovo that GWAN is no longer available from WHL generation because the design does not match with current pin control. Signed-off-by: Slark Xiao <slark_xiao@163.com> Link: https://lore.kernel.org/r/20211108060648.8212-1-slark_xiao@163.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-11-16 | platform/x86: thinkpad_acpi: Add support for dual fan control | Jimmy Wang
This adds dual fan control for P1 / X1 Extreme Gen4 Signed-off-by: Jimmy Wang <jimmy221b@163.com> Link: https://lore.kernel.org/r/20211105090528.39677-1-jimmy221b@163.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-11-16 | platform/x86: think-lmi: Abort probe on analyze failure | Alex Williamson
A Lenovo ThinkStation S20 (4157CTO BIOS 60KT41AUS) fails to boot on recent kernels including the think-lmi driver, due to the fact that errors returned by the tlmi_analyze() function are ignored by tlmi_probe(), where tlmi_sysfs_init() is called unconditionally. This results in making use of an array of already freed, non-null pointers and other uninitialized globals, causing all sorts of nasty kobject and memory faults. Make use of the analyze function return value, free a couple leaked allocations, and remove the settings_count field, which is incremented but never consumed. Fixes: a40cd7ef22fb ("platform/x86: think-lmi: Add WMI interface support on Lenovo platforms") Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Reviewed-by: Mark Gross <markgross@kernel.org> Reviewed-by: Mark Pearson <markpearson@lenovo.com> Link: https://lore.kernel.org/r/163639463588.1330483.15850167112490200219.stgit@omen Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-11-16 | platform/x86: dell-wmi-descriptor: disable by default | Thomas Weißschuh
dell-wmi-descriptor only provides symbols to other drivers. These drivers already select dell-wmi-descriptor when needed. This fixes an issue where dell-wmi-descriptor is compiled as a module with localyesconfig on a non-Dell machine. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20211113080551.61860-1-linux@weissschuh.net Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-11-16 | platform/x86: samsung-laptop: Fix typo in a comment | Jason Wang
The double `it' is repeated in a comment, therefore one of them is removed. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Link: https://lore.kernel.org/r/20211113054827.199517-1-wangborong@cdjrlc.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-11-16 | platform/x86: hp_accel: Fix an error handling path in 'lis3lv02d_probe()' | Christophe JAILLET
If 'led_classdev_register()' fails, some additional resources should be released. Add the missing 'i8042_remove_filter()' and 'lis3lv02d_remove_fs()' calls that are already in the remove function but are missing here. Fixes: a4c724d0723b ("platform: hp_accel: add a i8042 filter to remove HPQ6000 data from kb bus stream") Fixes: 9e0c79782143 ("lis3lv02d: merge with leds hp disk") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/5a4f218f8f16d2e3a7906b7ca3654ffa946895f8.1636314074.git.christophe.jaillet@wanadoo.fr Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
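The general shape of such an error path, where a late failure must undo every earlier setup step, is the goto-unwind idiom. The sketch below is a user-space model with hypothetical names (`demo_probe()`, `demo_step()`, the flags); it does not reproduce the hp_accel driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Each flag records whether one setup step is currently in effect. */
struct demo_dev {
    bool fs_ready;
    bool filter_installed;
    bool led_registered;
};

/* Perform one setup step, or fail without touching the flag. */
static int demo_step(bool *flag, bool fail)
{
    if (fail)
        return -1;
    *flag = true;
    return 0;
}

/* Set up three resources; on failure, release what was already set up. */
static int demo_probe(struct demo_dev *d, bool led_fails)
{
    int ret;

    assert(d);

    ret = demo_step(&d->fs_ready, false);
    if (ret)
        return ret;

    ret = demo_step(&d->filter_installed, false);
    if (ret)
        goto err_remove_fs;

    ret = demo_step(&d->led_registered, led_fails);
    if (ret)
        goto err_remove_filter;

    return 0;

err_remove_filter:
    d->filter_installed = false; /* undo in reverse order of setup */
err_remove_fs:
    d->fs_ready = false;
    return ret;
}
```

Returning directly from the last failing step, without the two labels, would be the kind of leak the message describes.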
2021-11-16 | platform/x86: amd-pmc: Make CONFIG_AMD_PMC depend on RTC_CLASS | Hans de Goede
Since the "Add special handling for timer based S0i3 wakeup" changes the amd-pmc code now relies on symbols from the RTC-class code, add a dependency for this to Kconfig. Fixes: 59348401ebed ("platform/x86: amd-pmc: Add special handling for timer based S0i3 wakeup") Cc: Mario Limonciello <mario.limonciello@amd.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20211102153256.76956-1-hdegoede@redhat.com
2021-11-16 | platform/mellanox: mlxreg-lc: fix error code in mlxreg_lc_create_static_devices() | Dan Carpenter
This code should be using PTR_ERR() instead of IS_ERR(). And because it's using the wrong "dev->client" pointer, the IS_ERR() check will be false, meaning the function returns success. Fixes: 62f9529b8d5c ("platform/mellanox: mlxreg-lc: Add initial support for Nvidia line card devices") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Vadim Pasternak <vadimp@nvidia.com> Link: https://lore.kernel.org/r/20211110074346.GB5176@kili Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
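Why returning the result of an IS_ERR()-style check can report success is easier to see with a user-space model of the error-pointer idiom. The `demo_*` helpers below imitate the idea of encoding a negative errno in a pointer; they are assumptions for illustration, and `demo_create_buggy()` is only one plausible shape of the mistake, not the driver's actual code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *demo_err_ptr(long err)
{
    assert(err < 0 && err >= -DEMO_MAX_ERRNO);
    return (void *)(intptr_t)err;
}

/* Decode the errno value carried by an error pointer. */
static inline long demo_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Truth value only: 1 for an error pointer, 0 for a valid one. */
static inline int demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Buggy: returns a truth value taken from a different, valid pointer. */
static long demo_create_buggy(const void *client, const void *other)
{
    if (demo_is_err(client))
        return demo_is_err(other); /* 0 when 'other' is valid: "success" */
    return 0;
}

/* Fixed: propagate the error code carried by the failing pointer. */
static long demo_create_fixed(const void *client)
{
    if (demo_is_err(client))
        return demo_ptr_err(client);
    return 0;
}
```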
2021-11-16 | drm/scheduler: fix drm_sched_job_add_implicit_dependencies | Christian König
Trivial fix since we now need to grab a reference to the fence we have added. Previously the dma_resv functions were doing that for us. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: 9c2ba265352a ("drm/scheduler: use new iterator in drm_sched_job_add_implicit_dependencies v2") Link: https://patchwork.freedesktop.org/patch/msgid/20211019112706.27769-1-christian.koenig@amd.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reported-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com> References: https://lore.kernel.org/dri-devel/2023306.UmlnhvANQh@archbook/ Tested-by: Nicolas Frattaroli <frattaroli.nicolas@gmail.com> Tested-by: Yassine Oudjana <y.oudjana@protonmail.com>
2021-11-16 | gpio: rockchip: needs GENERIC_IRQ_CHIP to fix build errors | Randy Dunlap
gpio-rockchip uses interfaces that are provided by the Kconfig symbol GENERIC_IRQ_CHIP, so the driver should select that symbol in order to prevent build errors. Fixes these build errors (and more):
aarch64-linux-ld: drivers/gpio/gpio-rockchip.o: in function `rockchip_irq_disable':
gpio-rockchip.c:(.text+0x454): undefined reference to `irq_gc_mask_set_bit'
aarch64-linux-ld: drivers/gpio/gpio-rockchip.o: in function `rockchip_irq_enable':
gpio-rockchip.c:(.text+0x478): undefined reference to `irq_gc_mask_clr_bit'
aarch64-linux-ld: drivers/gpio/gpio-rockchip.o: in function `rockchip_interrupts_register':
gpio-rockchip.c:(.text+0x518): undefined reference to `irq_generic_chip_ops'
aarch64-linux-ld: gpio-rockchip.c:(.text+0x594): undefined reference to `__irq_alloc_domain_generic_chips'
aarch64-linux-ld: gpio-rockchip.c:(.text+0x5cc): undefined reference to `irq_get_domain_generic_chip'
aarch64-linux-ld: gpio-rockchip.c:(.text+0x5e0): undefined reference to `irq_gc_ack_set_bit'
aarch64-linux-ld: gpio-rockchip.c:(.text+0x604): undefined reference to `irq_gc_set_wake'
Fixes: 936ee2675eee ("gpio/rockchip: add driver for rockchip gpio") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2021-11-16 | mips: lantiq: add support for clk_get_parent() | Randy Dunlap
Provide a simple implementation of clk_get_parent() in the lantiq subarch so that callers of it will build without errors. Fixes this build error: ERROR: modpost: "clk_get_parent" [drivers/iio/adc/ingenic-adc.ko] undefined! Fixes: 171bb2f19ed6 ("MIPS: Lantiq: Add initial support for Lantiq SoCs") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Cc: linux-mips@vger.kernel.org Cc: John Crispin <john@phrozen.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Jonathan Cameron <jic23@kernel.org> Cc: linux-iio@vger.kernel.org Cc: Russell King <linux@armlinux.org.uk> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: John Crispin <john@phrozen.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-11-16 | mips: bcm63xx: add support for clk_get_parent() | Randy Dunlap
BCM63XX selects HAVE_LEGACY_CLK but does not provide/support clk_get_parent(), so add a simple implementation of that function so that callers of it will build without errors. Fixes these build errors: mips-linux-ld: drivers/iio/adc/ingenic-adc.o: in function `jz4770_adc_init_clk_div': ingenic-adc.c:(.text+0xe4): undefined reference to `clk_get_parent' mips-linux-ld: drivers/iio/adc/ingenic-adc.o: in function `jz4725b_adc_init_clk_div': ingenic-adc.c:(.text+0x1b8): undefined reference to `clk_get_parent' Fixes: e7300d04bd08 ("MIPS: BCM63xx: Add support for the Broadcom BCM63xx family of SOCs." ) Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Cc: Artur Rojek <contact@artur-rojek.eu> Cc: Paul Cercueil <paul@crapouillou.net> Cc: linux-mips@vger.kernel.org Cc: Jonathan Cameron <jic23@kernel.org> Cc: Lars-Peter Clausen <lars@metafoo.de> Cc: linux-iio@vger.kernel.org Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: bcm-kernel-feedback-list@broadcom.com Cc: Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-11-16 | MIPS: generic/yamon-dt: fix uninitialized variable error | Colin Ian King
In the case where fw_getenv returns an error when fetching values for ememsize and memsize, the variable phys_memsize is not assigned a value and will be uninitialized on a zero check of phys_memsize. Fix this by initializing phys_memsize to zero. Cleans up cppcheck error: arch/mips/generic/yamon-dt.c:100:7: error: Uninitialized variable: phys_memsize [uninitvar] Fixes: f41d2430bbd6 ("MIPS: generic/yamon-dt: Support > 256MB of RAM") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
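The pattern behind this fix, where a variable is only written when a lookup succeeds and is then tested for zero, can be sketched in user space. The `demo_*` functions and the string-based lookup are assumptions for illustration and do not mirror the firmware environment API.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a size string; writes *out only on success, returns -1 on failure. */
static int demo_parse_size(const char *s, unsigned long *out)
{
    char *end;
    unsigned long val;

    if (!s || *s == '\0')
        return -1;
    val = strtoul(s, &end, 0);
    if (*end != '\0')
        return -1;
    *out = val;
    return 0;
}

/*
 * Prefer 'ememsize', fall back to 'memsize', then to a default.
 * Without the "= 0" initializer, the zero check below would read an
 * indeterminate value whenever both lookups fail.
 */
static unsigned long demo_detect_memsize(const char *ememsize,
                                         const char *memsize,
                                         unsigned long fallback)
{
    unsigned long phys_memsize = 0;

    assert(fallback != 0);

    if (demo_parse_size(ememsize, &phys_memsize) != 0)
        (void)demo_parse_size(memsize, &phys_memsize);

    if (!phys_memsize)
        phys_memsize = fallback;
    return phys_memsize;
}
```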
2021-11-16 | MIPS: syscalls: Wire up futex_waitv syscall | Wang Haojun
Wire up the futex_waitv syscall. Fix Build warning: #warning syscall futex_waitv not implemented [-Wcpp] Signed-off-by: Wang Haojun <wanghaojun@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-11-16 | Merge drm/drm-fixes into drm-misc-fixes | Maxime Ripard
We need -rc1 to address a breakage in drm/scheduler affecting panfrost. Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2021-11-15 | bpf: Fix toctou on read-only map's constant scalar tracking | Daniel Borkmann
Commit a23740ec43ba ("bpf: Track contents of read-only maps as scalars") is checking whether maps are read-only both from BPF program side and user space side, and then, given their content is constant, reading out their data via map->ops->map_direct_value_addr() which is then subsequently used as known scalar value for the register, that is, it is marked as __mark_reg_known() with the read value at verification time. Before a23740ec43ba, the register content was marked as an unknown scalar so the verifier could not make any assumptions about the map content. The current implementation however is prone to a TOCTOU race, meaning, the value read as known scalar for the register is not guaranteed to be exactly the same at a later point when the program is executed, and as such, the prior made assumptions of the verifier with regards to the program will be invalid which can cause issues such as OOB access, etc. While the BPF_F_RDONLY_PROG map flag is always fixed and required to be specified at map creation time, the map->frozen property is initially set to false for the map given the map value needs to be populated, e.g. for global data sections. Once complete, the loader "freezes" the map from user space such that no subsequent updates/deletes are possible anymore. For the rest of the lifetime of the map, this freeze one-time trigger cannot be undone anymore after a successful BPF_MAP_FREEZE cmd return. Meaning, any new BPF_* cmd calls which would update/delete map entries will be rejected with -EPERM since map_get_sys_perms() removes the FMODE_CAN_WRITE permission. This also means that pending update/delete map entries must still complete before this guarantee is given. This corner case is not an issue for loaders since they create and prepare such program private map in successive steps. However, a malicious user is able to trigger this TOCTOU race in two different ways: i) via userfaultfd, and ii) via batched updates. 
For i) userfaultfd is used to widen the race window, so that map_update_elem() can modify the contents of the map after map_freeze() and bpf_prog_load() were executed. This works, because userfaultfd halts the parallel thread which triggered a map_update_elem() at the point where we copy key/value from the user buffer and this already passed the FMODE_CAN_WRITE capability test given at that time the map was not "frozen". Then, the main thread performs the map_freeze() and bpf_prog_load(), and once that has completed successfully, the other thread is woken up to complete the pending map_update_elem() which then changes the map content. For ii) the idea of the batched update is similar, meaning, when there are a large number of updates to be processed, it can widen the race window between the two. It is therefore possible in practice to modify the contents of the map after executing map_freeze() and bpf_prog_load(). One way to fix both i) and ii) at the same time is to expand the use of the map's map->writecnt. The latter was introduced in fc9702273e2e ("bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY") and further refined in 1f6cb19be2e2 ("bpf: Prevent re-mmap()'ing BPF map as writable for initially r/o mapping") with the rationale to make a writable mmap()'ing of a map mutually exclusive with read-only freezing. The counter indicates writable mmap() mappings and then prevents/fails the freeze operation. Its semantics can be expanded beyond just mmap() by generally indicating ongoing write phases. This would essentially span any parallel regular and batched flavor of update/delete operation and then also have map_freeze() fail with -EBUSY. For the check_mem_access() in the verifier we expand upon the bpf_map_is_rdonly() check ensuring that all last pending writes have completed via bpf_map_write_active() test. 
Once the map->frozen is set and bpf_map_write_active() indicates a map->writecnt of 0 only then we are really guaranteed to use the map's data as known constants. For map->frozen being set and pending writes still in the process of completing we fall back to marking that register as unknown scalar so we don't end up making assumptions about it. With this, both TOCTOU reproducers from i) and ii) are fixed. Note that the map->writecnt has been converted into an atomic64 in the fix in order to avoid a double freeze_mutex mutex_{un,}lock() pair when updating map->writecnt in the various map update/delete BPF_* cmd flavors. Spanning the freeze_mutex over entire map update/delete operations in syscall side would not be possible due to then causing everything to be serialized. Similarly, something like synchronize_rcu() after setting map->frozen to wait for update/deletes to complete is not possible either since it would also have to span the user copy which can sleep. On the libbpf side, this won't break d66562fba1ce ("libbpf: Add BPF object skeleton support") as the anonymous mmap()-ed "map initialization image" is remapped as a BPF map-backed mmap()-ed memory where for .rodata it's non-writable. Fixes: a23740ec43ba ("bpf: Track contents of read-only maps as scalars") Reported-by: w1tcher.bupt@gmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
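The write-counting scheme described in this commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: struct toy_map and every toy_* function are hypothetical names, the model is single-threaded, and it omits the freeze_mutex and the BPF_F_RDONLY_PROG check. It shows a writer bumping the counter before the permission test, the freeze failing with -EBUSY while a write is pending, and the "value may be treated as constant" test requiring both frozen and no active writers.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of a map; NOT the kernel's struct bpf_map. */
struct toy_map {
    _Atomic int64_t writecnt; /* number of in-flight write phases */
    bool frozen;              /* set once by a successful freeze */
};

/* True while any update/delete (or writable mapping) is still pending. */
static bool toy_write_active(struct toy_map *m)
{
    return atomic_load(&m->writecnt) != 0;
}

/* Mark the end of a write phase. */
static void toy_write_end(struct toy_map *m)
{
    int64_t prev = atomic_fetch_sub(&m->writecnt, 1);
    assert(prev > 0); /* an end must pair with an earlier begin */
}

/*
 * Start a write phase: count it first, then test the permission.
 * Returns 0 on success, or -EPERM if the map is already frozen.
 */
static int toy_update_begin(struct toy_map *m)
{
    atomic_fetch_add(&m->writecnt, 1);
    if (m->frozen) {
        toy_write_end(m);
        return -EPERM;
    }
    return 0;
}

/*
 * Freeze fails with -EBUSY while writes are pending. Returning -EBUSY
 * for an already frozen map is a choice made for this model.
 */
static int toy_freeze(struct toy_map *m)
{
    if (toy_write_active(m) || m->frozen)
        return -EBUSY;
    m->frozen = true;
    return 0;
}

/* Contents may be treated as constant only if frozen AND no writer is active. */
static bool toy_value_is_constant(struct toy_map *m)
{
    return m->frozen && !toy_write_active(m);
}
```

In this model a pending toy_update_begin() blocks toy_freeze(), and toy_value_is_constant() only becomes true after the matching toy_write_end() and a successful freeze.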
2021-11-15samples/bpf: Fix build error due to -isystem removalAlexander Lobakin
Since recent Kbuild updates we no longer include files from compiler directories. However, samples/bpf/hbm_kern.h hasn't been tuned for this (LLVM 13): CLANG-bpf samples/bpf/hbm_out_kern.o In file included from samples/bpf/hbm_out_kern.c:55: samples/bpf/hbm_kern.h:12:10: fatal error: 'stddef.h' file not found ^~~~~~~~~~ 1 error generated. CLANG-bpf samples/bpf/hbm_edt_kern.o In file included from samples/bpf/hbm_edt_kern.c:53: samples/bpf/hbm_kern.h:12:10: fatal error: 'stddef.h' file not found ^~~~~~~~~~ 1 error generated. It is enough to just drop both stdbool.h and stddef.h from includes to fix those. Fixes: 04e85bbf71c9 ("isystem: delete global -isystem compile option") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://lore.kernel.org/bpf/20211115130741.3584-1-alexandr.lobakin@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-11-15Merge branch 'Forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing progs'Alexei Starovoitov
Dmitrii Banshchikov says: ==================== Various locking issues are possible with bpf_ktime_get_coarse_ns() and the bpf_timer_* set of helpers. syzbot found a locking issue with the bpf_ktime_get_coarse_ns() helper executed in the BPF_PROG_TYPE_PERF_EVENT prog type - [1]. The issue is possible because the helper uses the non-fast version of the time accessor, which is not safe in every context. The helper was added because it provided performance benefits in comparison to the bpf_ktime_get_ns() helper. A similar locking issue is possible with the bpf_timer_* set of helpers when used in tracing progs. The solution is to restrict use of the helpers in tracing progs. In the [1] discussion it was stated that bpf_spin_lock related helpers shall also be excluded for tracing progs. The verifier has a compatibility check between a map and a program. If a tracing program tries to use a map whose value contains struct bpf_spin_lock, the verifier rejects it, which is why bpf_spin_lock is already restricted. Patch 1 restricts helpers Patch 2 adds tests v1 -> v2: * Limit the helpers via func proto getters instead of allowed callback * Add note about helpers' restrictions to linux/bpf.h * Add Fixes tag * Remove extra \0 from btf_str_sec * Beside asm tests add prog tests * Trim CC 1. https://lore.kernel.org/all/00000000000013aebd05cff8e064@google.com/ ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-11-15selftests/bpf: Add tests for restricted helpersDmitrii Banshchikov
This patch adds tests that bpf_ktime_get_coarse_ns(), bpf_timer_* and bpf_spin_lock()/bpf_spin_unlock() helpers are forbidden in tracing progs as their use there may result in various locking issues. Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211113142227.566439-3-me@ubique.spb.ru
2021-11-15bpf: Forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing progsDmitrii Banshchikov
Use of bpf_ktime_get_coarse_ns() and bpf_timer_* helpers in tracing progs may result in locking issues. bpf_ktime_get_coarse_ns() uses ktime_get_coarse_ns() time accessor that isn't safe for any context: ====================================================== WARNING: possible circular locking dependency detected 5.15.0-syzkaller #0 Not tainted ------------------------------------------------------ syz-executor.4/14877 is trying to acquire lock: ffffffff8cb30008 (tk_core.seq.seqcount){----}-{0:0}, at: ktime_get_coarse_ts64+0x25/0x110 kernel/time/timekeeping.c:2255 but task is already holding lock: ffffffff90dbf200 (&obj_hash[i].lock){-.-.}-{2:2}, at: debug_object_deactivate+0x61/0x400 lib/debugobjects.c:735 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&obj_hash[i].lock){-.-.}-{2:2}: lock_acquire+0x19f/0x4d0 kernel/locking/lockdep.c:5625 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0xd1/0x120 kernel/locking/spinlock.c:162 __debug_object_init+0xd9/0x1860 lib/debugobjects.c:569 debug_hrtimer_init kernel/time/hrtimer.c:414 [inline] debug_init kernel/time/hrtimer.c:468 [inline] hrtimer_init+0x20/0x40 kernel/time/hrtimer.c:1592 ntp_init_cmos_sync kernel/time/ntp.c:676 [inline] ntp_init+0xa1/0xad kernel/time/ntp.c:1095 timekeeping_init+0x512/0x6bf kernel/time/timekeeping.c:1639 start_kernel+0x267/0x56e init/main.c:1030 secondary_startup_64_no_verify+0xb1/0xbb -> #0 (tk_core.seq.seqcount){----}-{0:0}: check_prev_add kernel/locking/lockdep.c:3051 [inline] check_prevs_add kernel/locking/lockdep.c:3174 [inline] validate_chain+0x1dfb/0x8240 kernel/locking/lockdep.c:3789 __lock_acquire+0x1382/0x2b00 kernel/locking/lockdep.c:5015 lock_acquire+0x19f/0x4d0 kernel/locking/lockdep.c:5625 seqcount_lockdep_reader_access+0xfe/0x230 include/linux/seqlock.h:103 ktime_get_coarse_ts64+0x25/0x110 kernel/time/timekeeping.c:2255 ktime_get_coarse include/linux/timekeeping.h:120 [inline] 
ktime_get_coarse_ns include/linux/timekeeping.h:126 [inline] ____bpf_ktime_get_coarse_ns kernel/bpf/helpers.c:173 [inline] bpf_ktime_get_coarse_ns+0x7e/0x130 kernel/bpf/helpers.c:171 bpf_prog_a99735ebafdda2f1+0x10/0xb50 bpf_dispatcher_nop_func include/linux/bpf.h:721 [inline] __bpf_prog_run include/linux/filter.h:626 [inline] bpf_prog_run include/linux/filter.h:633 [inline] BPF_PROG_RUN_ARRAY include/linux/bpf.h:1294 [inline] trace_call_bpf+0x2cf/0x5d0 kernel/trace/bpf_trace.c:127 perf_trace_run_bpf_submit+0x7b/0x1d0 kernel/events/core.c:9708 perf_trace_lock+0x37c/0x440 include/trace/events/lock.h:39 trace_lock_release+0x128/0x150 include/trace/events/lock.h:58 lock_release+0x82/0x810 kernel/locking/lockdep.c:5636 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:149 [inline] _raw_spin_unlock_irqrestore+0x75/0x130 kernel/locking/spinlock.c:194 debug_hrtimer_deactivate kernel/time/hrtimer.c:425 [inline] debug_deactivate kernel/time/hrtimer.c:481 [inline] __run_hrtimer kernel/time/hrtimer.c:1653 [inline] __hrtimer_run_queues+0x2f9/0xa60 kernel/time/hrtimer.c:1749 hrtimer_interrupt+0x3b3/0x1040 kernel/time/hrtimer.c:1811 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1086 [inline] __sysvec_apic_timer_interrupt+0xf9/0x270 arch/x86/kernel/apic/apic.c:1103 sysvec_apic_timer_interrupt+0x8c/0xb0 arch/x86/kernel/apic/apic.c:1097 asm_sysvec_apic_timer_interrupt+0x12/0x20 __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline] _raw_spin_unlock_irqrestore+0xd4/0x130 kernel/locking/spinlock.c:194 try_to_wake_up+0x702/0xd20 kernel/sched/core.c:4118 wake_up_process kernel/sched/core.c:4200 [inline] wake_up_q+0x9a/0xf0 kernel/sched/core.c:953 futex_wake+0x50f/0x5b0 kernel/futex/waitwake.c:184 do_futex+0x367/0x560 kernel/futex/syscalls.c:127 __do_sys_futex kernel/futex/syscalls.c:199 [inline] __se_sys_futex+0x401/0x4b0 kernel/futex/syscalls.c:180 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 
arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae There is a possible deadlock with bpf_timer_* set of helpers: hrtimer_start() lock_base(); trace_hrtimer...() perf_event() bpf_run() bpf_timer_start() hrtimer_start() lock_base() <- DEADLOCK Forbid use of bpf_ktime_get_coarse_ns() and bpf_timer_* helpers in BPF_PROG_TYPE_KPROBE, BPF_PROG_TYPE_TRACEPOINT, BPF_PROG_TYPE_PERF_EVENT and BPF_PROG_TYPE_RAW_TRACEPOINT prog types. Fixes: d05512618056 ("bpf: Add bpf_ktime_get_coarse_ns helper") Fixes: b00628b1c7d5 ("bpf: Introduce bpf timers.") Reported-by: syzbot+43fd005b5a1b4d10781e@syzkaller.appspotmail.com Signed-off-by: Dmitrii Banshchikov <me@ubique.spb.ru> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211113142227.566439-2-me@ubique.spb.ru
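The restriction above, which the cover letter says is done in the func proto getters, can be pictured with a toy lookup. This is a minimal sketch under stated assumptions: the toy_* enums, struct toy_func_proto and toy_get_func_proto() are hypothetical stand-ins, not the kernel's types or functions. The getter returns NULL for the restricted helpers when the program type is one of the four tracing types named in the commit, so such a program has no way to resolve the helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for program types; not the kernel's enum. */
enum toy_prog_type {
    TOY_PROG_SOCKET_FILTER,
    TOY_PROG_KPROBE,
    TOY_PROG_TRACEPOINT,
    TOY_PROG_PERF_EVENT,
    TOY_PROG_RAW_TRACEPOINT,
};

/* Hypothetical helper IDs; TOY_HELPER_COUNT is just the table size. */
enum toy_helper {
    TOY_KTIME_GET_NS,
    TOY_KTIME_GET_COARSE_NS,
    TOY_TIMER_START,
    TOY_HELPER_COUNT,
};

struct toy_func_proto {
    const char *name;
};

static const struct toy_func_proto toy_protos[TOY_HELPER_COUNT] = {
    [TOY_KTIME_GET_NS]        = { "ktime_get_ns" },
    [TOY_KTIME_GET_COARSE_NS] = { "ktime_get_coarse_ns" },
    [TOY_TIMER_START]         = { "timer_start" },
};

/* The four program types the commit message lists as restricted. */
static bool toy_is_tracing_prog(enum toy_prog_type type)
{
    switch (type) {
    case TOY_PROG_KPROBE:
    case TOY_PROG_TRACEPOINT:
    case TOY_PROG_PERF_EVENT:
    case TOY_PROG_RAW_TRACEPOINT:
        return true;
    default:
        return false;
    }
}

/* Returns the helper's proto, or NULL if this program type may not use it. */
static const struct toy_func_proto *
toy_get_func_proto(enum toy_helper helper, enum toy_prog_type type)
{
    assert((unsigned int)helper < TOY_HELPER_COUNT);

    if (toy_is_tracing_prog(type) &&
        (helper == TOY_KTIME_GET_COARSE_NS || helper == TOY_TIMER_START))
        return NULL;

    return &toy_protos[helper];
}
```

Returning NULL from the lookup, rather than checking at run time, means the rejection happens when the program is loaded and never inside the tracing context where the deadlock could occur.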
2021-11-15ARM: dts: bcm2711: Fix PCIe interruptsFlorian Fainelli
The PCIe host bridge has two interrupt lines, one that goes towards its PCIE_INTR2 second level interrupt controller and one for its MSI second level interrupt controller. The first interrupt line is not currently managed by the driver, which is why it was not a functional problem. The interrupt-map property was also only listing the PCI_INTA interrupts when there are also the INTB, C and D. Reported-by: Jim Quinlan <jim2101024@gmail.com> Fixes: d5c8dc0d4c88 ("ARM: dts: bcm2711: Enable PCIe controller") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
2021-11-15ARM: dts: BCM5301X: Add interrupt properties to GPIO nodeFlorian Fainelli
The GPIO controller is also an interrupt controller provider and is currently missing the appropriate 'interrupt-controller' and '#interrupt-cells' properties to denote that. Fixes: fb026d3de33b ("ARM: BCM5301X: Add Broadcom's bus-axi to the DTS file") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
2021-11-15ARM: dts: BCM5301X: Fix I2C controller interruptFlorian Fainelli
The I2C controller interrupt line is off by 32 because the datasheet describes interrupt inputs into the GIC, which are Shared Peripheral Interrupts and start at offset 32. The ARM GIC binding expects the SPI interrupts to be numbered from 0 relative to the SPI base. Fixes: bb097e3e0045 ("ARM: dts: BCM5301X: Add I2C support to the DT") Tested-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
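The off-by-32 arithmetic in this fix can be written out as a small conversion. This is an illustrative sketch only: toy_gic_id_to_spi() and the TOY_* constants are hypothetical names, and the example inputs are arbitrary rather than taken from the BCM5301X datasheet. It assumes the usual GIC layout in which Shared Peripheral Interrupts occupy interrupt IDs 32 through 1019, so the SPI-relative number used in the device tree is the GIC input ID minus 32.

```c
#include <assert.h>

/* Assumed GIC layout: SPIs occupy interrupt IDs 32..1019. */
#define TOY_GIC_SPI_BASE    32
#define TOY_GIC_SPI_LAST_ID 1019

/* SPI-relative numbers therefore run from 0 to 987. */
static_assert(TOY_GIC_SPI_LAST_ID - TOY_GIC_SPI_BASE == 987,
              "SPI-relative numbers span 0..987");

/*
 * Convert a GIC interrupt ID (as a datasheet might list it) to the
 * SPI-relative number. Returns -1 if the ID is not an SPI.
 */
static int toy_gic_id_to_spi(int gic_id)
{
    if (gic_id < TOY_GIC_SPI_BASE || gic_id > TOY_GIC_SPI_LAST_ID)
        return -1;
    return gic_id - TOY_GIC_SPI_BASE;
}
```

For example, a datasheet entry of 121 would be written as SPI 89 in the device tree, and an ID below 32 is not an SPI at all.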
2021-11-15blk-mq: cancel blk-mq dispatch work in both blk_cleanup_queue and disk_release()Ming Lei
To avoid slowing down queue destruction, we don't call blk_mq_quiesce_queue() in blk_cleanup_queue(); instead, canceling the dispatch work is delayed until blk_release_queue(). However, this approach has caused a kernel oops[1], reported by Changhui. The log shows that scsi_device can be freed before running blk_release_queue(), which is expected too since scsi_device is released after the scsi disk is closed and the scsi_device is removed. Fix the issue by canceling blk-mq dispatch work in both blk_cleanup_queue() and disk_release(): 1) when disk_release() is run, the disk has been closed, and any sync dispatch activities have been done, so canceling dispatch work is enough to quiesce filesystem I/O dispatch activity. 2) in blk_cleanup_queue(), we only focus on passthrough requests, and a passthrough request is always explicitly allocated & freed by its caller, so once the queue is frozen, all sync dispatch activity for passthrough requests has been done, and it is enough to just cancel dispatch work to avoid any dispatch activity. [1] kernel panic log [12622.769416] BUG: kernel NULL pointer dereference, address: 0000000000000300 [12622.777186] #PF: supervisor read access in kernel mode [12622.782918] #PF: error_code(0x0000) - not-present page [12622.788649] PGD 0 P4D 0 [12622.791474] Oops: 0000 [#1] PREEMPT SMP PTI [12622.796138] CPU: 10 PID: 744 Comm: kworker/10:1H Kdump: loaded Not tainted 5.15.0+ #1 [12622.804877] Hardware name: Dell Inc. 
PowerEdge R730/0H21J3, BIOS 1.5.4 10/002/2015 [12622.813321] Workqueue: kblockd blk_mq_run_work_fn [12622.818572] RIP: 0010:sbitmap_get+0x75/0x190 [12622.823336] Code: 85 80 00 00 00 41 8b 57 08 85 d2 0f 84 b1 00 00 00 45 31 e4 48 63 cd 48 8d 1c 49 48 c1 e3 06 49 03 5f 10 4c 8d 6b 40 83 f0 01 <48> 8b 33 44 89 f2 4c 89 ef 0f b6 c8 e8 fa f3 ff ff 83 f8 ff 75 58 [12622.844290] RSP: 0018:ffffb00a446dbd40 EFLAGS: 00010202 [12622.850120] RAX: 0000000000000001 RBX: 0000000000000300 RCX: 0000000000000004 [12622.858082] RDX: 0000000000000006 RSI: 0000000000000082 RDI: ffffa0b7a2dfe030 [12622.866042] RBP: 0000000000000004 R08: 0000000000000001 R09: ffffa0b742721334 [12622.874003] R10: 0000000000000008 R11: 0000000000000008 R12: 0000000000000000 [12622.881964] R13: 0000000000000340 R14: 0000000000000000 R15: ffffa0b7a2dfe030 [12622.889926] FS: 0000000000000000(0000) GS:ffffa0baafb40000(0000) knlGS:0000000000000000 [12622.898956] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [12622.905367] CR2: 0000000000000300 CR3: 0000000641210001 CR4: 00000000001706e0 [12622.913328] Call Trace: [12622.916055] <TASK> [12622.918394] scsi_mq_get_budget+0x1a/0x110 [12622.922969] __blk_mq_do_dispatch_sched+0x1d4/0x320 [12622.928404] ? pick_next_task_fair+0x39/0x390 [12622.933268] __blk_mq_sched_dispatch_requests+0xf4/0x140 [12622.939194] blk_mq_sched_dispatch_requests+0x30/0x60 [12622.944829] __blk_mq_run_hw_queue+0x30/0xa0 [12622.949593] process_one_work+0x1e8/0x3c0 [12622.954059] worker_thread+0x50/0x3b0 [12622.958144] ? rescuer_thread+0x370/0x370 [12622.962616] kthread+0x158/0x180 [12622.966218] ? 
set_kthread_struct+0x40/0x40 [12622.970884] ret_from_fork+0x22/0x30 [12622.974875] </TASK> [12622.977309] Modules linked in: scsi_debug rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs sunrpc dm_multipath intel_rapl_msr intel_rapl_common dell_wmi_descriptor sb_edac rfkill video x86_pkg_temp_thermal intel_powerclamp dcdbas coretemp kvm_intel kvm mgag200 irqbypass i2c_algo_bit rapl drm_kms_helper ipmi_ssif intel_cstate intel_uncore syscopyarea sysfillrect sysimgblt fb_sys_fops pcspkr cec mei_me lpc_ich mei ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c sr_mod cdrom sd_mod t10_pi sg ixgbe ahci libahci crct10dif_pclmul crc32_pclmul crc32c_intel libata megaraid_sas ghash_clmulni_intel tg3 wdat_wdt mdio dca wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_debug] Reported-by: ChanghuiZhong <czhong@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Bart Van Assche <bvanassche@acm.org> Cc: linux-scsi@vger.kernel.org Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20211116014343.610501-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-16pinctrl: qcom: sm8350: Correct UFS and SDC offsetsBjorn Andersson
The downstream TLMM binding covers a group of TLMM-related hardware blocks, but the upstream binding only captures the particular block related to controlling the TLMM pins from an OS. In the translation of the driver from downstream, the offset of 0x100000 was lost for the UFS and SDC pingroups. Fixes: d5d348a3271f ("pinctrl: qcom: Add SM8350 pinctrl driver") Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Vinod Koul <vkoul@kernel.org> Reviewed-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org> Link: https://lore.kernel.org/r/20211104170835.1993686-1-bjorn.andersson@linaro.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org>