summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-08-25selftests: xsk: Simplify cleanup of ifobjectsMagnus Karlsson
Simpify the cleanup of ifobjects right before the program exits by introducing functions for creating and destroying these objects. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-13-magnus.karlsson@gmail.com
2021-08-25selftests: xsk: Decrease sending speedMagnus Karlsson
Decrease sending speed to avoid potentially overflowing some buffers in the skb case that leads to dropped packets we cannot control (and thus the tests may generate false negatives). Decrease batch size and introduce a usleep in the transmit thread to not overflow the receiver. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-12-magnus.karlsson@gmail.com
2021-08-25selftests: xsk: Validate tx stats on tx threadMagnus Karlsson
Validate the tx stats on the Tx thread instead of the Rx thread. Depending on your settings, you might not be allowed to query the statistics of a socket you do not own, so better to do this on the correct thread to start with. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-11-magnus.karlsson@gmail.com
2021-08-25selftests: xsk: Simplify packet validation in xsk testsMagnus Karlsson
Simplify packet validation in the xsk selftests by performing it at once for every packet. The current code performed this per batch and did this on copied packet data. Make it simpler and faster by validating it at once and on the umem packet data thus skipping the copy and the memory allocation for the temprary buffer. The optional packet dump feature is also simplified in the same manner. Memory allocation and copying is removed and the dump is performed directly on the umem data. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-10-magnus.karlsson@gmail.com
2021-08-25selftests: xsk: Rename worker_* functions that are not thread entry pointsMagnus Karlsson
Rename worker_* functions that are not thread entry points to something else. This was confusing. Now only thread entry points are worker_something. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-9-magnus.karlsson@gmail.com
2021-08-25selftests: xsk: Disassociate umem size with packets sentMagnus Karlsson
Disassociate the number of packets sent with the number of buffers in the umem. This so we can loop over the umem to test more things. Set the size of the umem to be a multiple of 2M. A requirement for huge pages that are needed in unaligned mode. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-8-magnus.karlsson@gmail.com
2021-08-25selftests: xsk: Remove end-of-test packetMagnus Karlsson
Get rid of the end-of-test packet and just count the number of packets received and quit when the expected number as been received. Simplifies the code. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-7-magnus.karlsson@gmail.com
2021-08-25selftests: xsk: Simplify the retry codeMagnus Karlsson
Simplify the retry code and make it more efficient by waiting first, instead of trying immediately which always fails due to the asynchronous nature of xsk socket close. Also decrease the wait time to significantly lower the run-time of the test suite. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-6-magnus.karlsson@gmail.com
2021-08-25selftests: xsk: Return correct error codesMagnus Karlsson
Return the correct error codes so they can be printed correctly. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-5-magnus.karlsson@gmail.com
2021-08-25selftests: xsk: Remove unused variablesMagnus Karlsson
Remove unused variables and typedefs. The *_npkts variables are incremented but never used. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-4-magnus.karlsson@gmail.com
2021-08-25selftests: xsk: Remove the num_tx_packets optionMagnus Karlsson
Remove the number of tx packet option as this should be decided by the test itself. Also change the number of packets to be sent to 4096 speeding up the execution. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-3-magnus.karlsson@gmail.com
2021-08-25selftests: xsk: Remove color modeMagnus Karlsson
Remove color mode since it does not add any value and having less code means less maintenance which is a good thing. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210825093722.10219-2-magnus.karlsson@gmail.com
2021-08-25io_uring: don't free request to slabHao Xu
It's not necessary to free the request back to slab when we fail to get sqe, just move it to state->free_list. Signed-off-by: Hao Xu <haoxu@linux.alibaba.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/20210825175856.194299-1-haoxu@linux.alibaba.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-25hv_utils: Set the maximum packet size for VSS driver to the length of the ↵Vitaly Kuznetsov
receive buffer Commit adae1e931acd ("Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer") introduced a notion of maximum packet size and for KVM and FCOPY drivers set it to the length of the receive buffer. VSS driver wasn't updated, this means that the maximum packet size is now VMBUS_DEFAULT_MAX_PKT_SIZE (4k). Apparently, this is not enough. I'm observing a packet of 6304 bytes which is being truncated to 4096. When VSS driver tries to read next packet from ring buffer it starts from the wrong offset and receives garbage. Set the maximum packet size to 'HV_HYP_PAGE_SIZE * 2' in VSS driver. This matches the length of the receive buffer and is in line with other utils drivers. Fixes: adae1e931acd ("Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20210825133857.847866-1-vkuznets@redhat.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2021-08-25PM: domains: Improve runtime PM performance state handlingDmitry Osipenko
GENPD core doesn't support handling performance state changes while consumer device is runtime-suspended or when runtime PM is disabled. GENPD core may override performance state that was configured by device driver while RPM of the device was disabled or device was RPM-suspended. Let's close that gap by allowing drivers to control performance state while RPM of a consumer device is disabled and to set up performance state of RPM-suspended device that will be applied by GENPD core on RPM-resume of the device. Fixes: 5937c3ce2122 ("PM: domains: Drop/restore performance state votes for devices at runtime PM") Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-08-25dt-bindings: Add vendor prefix for Topic Embedded SystemsMichal Simek
Add vendor prefix for Topic Embedded Systems (http://topic.nl). Signed-off-by: Michal Simek <michal.simek@xilinx.com> Link: https://lore.kernel.org/r/b6e42012977876c421672a84bdb7636be819d664.1629877585.git.michal.simek@xilinx.com Signed-off-by: Rob Herring <robh@kernel.org>
2021-08-25of: fdt: Rename reserve_elfcorehdr() to fdt_reserve_elfcorehdr()Geert Uytterhoeven
On ia64/allmodconfig: drivers/of/fdt.c:609:20: error: conflicting types for 'reserve_elfcorehdr'; have 'void(void)' 609 | static void __init reserve_elfcorehdr(void) | ^~~~~~~~~~~~~~~~~~ arch/ia64/include/asm/meminit.h:43:12: note: previous declaration of 'reserve_elfcorehdr' with type 'int(u64 *, u64 *)' {aka 'int(long long unsigned int *, long long unsigned int *)'} 43 | extern int reserve_elfcorehdr(u64 *start, u64 *end); | ^~~~~~~~~~~~~~~~~~ Fix this by prefixing the FDT function name with "fdt_". Fixes: f7e7ce93aac13118 ("of: fdt: Add generic support for handling elf core headers property") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/f6eabbbce0fba6da3da0264c1e1cf23c01173999.1629884393.git.geert+renesas@glider.be Signed-off-by: Rob Herring <robh@kernel.org>
2021-08-25powercap: Add Power Limit4 support for Alder Lake SoCSumeet Pawnikar
Add Power Limit4 support for Alder Lake SoC. Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-08-25cpufreq: intel_pstate: Process HWP Guaranteed change notificationSrinivas Pandruvada
It is possible that HWP guaranteed ratio is changed in response to change in power and thermal limits. For example when Intel Speed Select performance profile is changed or there is change in TDP, hardware can send notifications. It is possible that the guaranteed ratio is increased. This creates an issue when turbo is disabled, as the old limits set in MSR_HWP_REQUEST are still lower and hardware will clip to older limits. This change enables HWP interrupt and process HWP interrupts. When guaranteed is changed, calls cpufreq_update_policy() so that driver callbacks are called to update to new HWP limits. This callback is called from a delayed workqueue of 10ms to avoid frequent updates. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-08-25thermal: intel: Allow processing of HWP interruptSrinivas Pandruvada
Add a weak function to process HWP (Hardware P-states) notifications and move updating HWP_STATUS MSR to this function. This allows HWP interrupts to be processed by the intel_pstate driver in HWP mode by overriding the implementation. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-08-25ACPI: tables: FPDT: Do not print FW_BUG message if record types are reservedAdrian Huang
In ACPI 6.4 spec, record types "0x0002-0xffff" of FPDT Performance Record Types [1] and record types "0x0003-0xffff" of Runtime Performance Record Types [2] are reserved. Users might be confused with the FW_BUG message, and they think this is the FW issue. Here is the example in a Lenovo box: ACPI: FPDT 0x00000000A820A000 000044 (v01 LENOVO THINKSYS 00000100 01000013) ACPI: Reserving FPDT table memory at [mem 0xa820a000-0xa820a043] ACPI FPDT: [Firmware Bug]: Invalid record 4113 found So, remove the FW_BUG message to avoid confusion since those types are reserved in ACPI 6.4 spec. [1] https://uefi.org/specs/ACPI/6.4/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html#fpdt-performance-record-types-table [2] https://uefi.org/specs/ACPI/6.4/05_ACPI_Software_Programming_Model/ACPI_Software_Programming_Model.html#runtime-performance-record-types-table Signed-off-by: Adrian Huang <ahuang12@lenovo.com> Acked-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-08-25ACPI: button: Add DMI quirk for Lenovo Yoga 9 (14INTL5)Ulrich Huber
The Lenovo Yoga 9 (14INTL5)'s ACPI _LID is bugged: After hibernation the lid is initially reported as closed. Once closing and then reopening the lid reports the lid as open again. This leads to the conclusion that the initial notification of the lid is missing but subsequent notifications are correct. In order fo the Linux LID code to handle this device properly the lid_init_state must be set to ACPI_BUTTON_LID_INIT_OPEN. Signed-off-by: Ulrich Huber <ulrich@huberulrich.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-08-25ACPI: Add memory semantics to acpi_os_map_memory()Lorenzo Pieralisi
The memory attributes attached to memory regions depend on architecture specific mappings. For some memory regions, the attributes specified by firmware (eg uncached) are not sufficient to determine how a memory region should be mapped by an OS (for instance a region that is define as uncached in firmware can be mapped as Normal or Device memory on arm64) and therefore the OS must be given control on how to map the region to match the expected mapping behaviour (eg if a mapping is requested with memory semantics, it must allow unaligned accesses). Rework acpi_os_map_memory() and acpi_os_ioremap() back-end to split them into two separate code paths: acpi_os_memmap() -> memory semantics acpi_os_ioremap() -> MMIO semantics The split allows the architectural implementation back-ends to detect the default memory attributes required by the mapping in question (ie the mapping API defines the semantics memory vs MMIO) and map the memory accordingly. Link: https://lore.kernel.org/linux-arm-kernel/31ffe8fc-f5ee-2858-26c5-0fd8bdd68702@arm.com Tested-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-08-25Merge branch 'bpf: Add bpf_task_pt_regs() helper'Alexei Starovoitov
Daniel Xu says: ==================== The motivation behind this helper is to access userspace pt_regs in a kprobe handler. uprobe's ctx is the userspace pt_regs. kprobe's ctx is the kernelspace pt_regs. bpf_task_pt_regs() allows accessing userspace pt_regs in a kprobe handler. The final case (kernelspace pt_regs in uprobe) is pretty rare (usermode helper) so I think that can be solved later if necessary. More concretely, this helper is useful in doing BPF-based DWARF stack unwinding. Currently the kernel can only do framepointer based stack unwinds for userspace code. This is because the DWARF state machines are too fragile to be computed in kernelspace [0]. The idea behind DWARF-based stack unwinds w/ BPF is to copy a chunk of the userspace stack (while in prog context) and send it up to userspace for unwinding (probably with libunwind) [1]. This would effectively enable profiling applications with -fomit-frame-pointer using kprobes and uprobes. [0]: https://lkml.org/lkml/2012/2/10/356 [1]: https://github.com/danobi/bpf-dwarf-walk Changes from v1: - Conwolidate BTF_ID decls for task_struct - Enable bpf_get_current_task_btf() for all prog types - Enable bpf_task_pt_regs() for all prog types - Use ASSERT_* macros instead of CHECK ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-08-25bpf: selftests: Add bpf_task_pt_regs() selftestDaniel Xu
This test retrieves the uprobe's pt_regs in two different ways and compares the contents in an arch-agnostic way. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/5581eb8800f6625ec8813fe21e9dce1fbdef4937.1629772842.git.dxu@dxuuu.xyz
2021-08-25bpf: Add bpf_task_pt_regs() helperDaniel Xu
The motivation behind this helper is to access userspace pt_regs in a kprobe handler. uprobe's ctx is the userspace pt_regs. kprobe's ctx is the kernelspace pt_regs. bpf_task_pt_regs() allows accessing userspace pt_regs in a kprobe handler. The final case (kernelspace pt_regs in uprobe) is pretty rare (usermode helper) so I think that can be solved later if necessary. More concretely, this helper is useful in doing BPF-based DWARF stack unwinding. Currently the kernel can only do framepointer based stack unwinds for userspace code. This is because the DWARF state machines are too fragile to be computed in kernelspace [0]. The idea behind DWARF-based stack unwinds w/ BPF is to copy a chunk of the userspace stack (while in prog context) and send it up to userspace for unwinding (probably with libunwind) [1]. This would effectively enable profiling applications with -fomit-frame-pointer using kprobes and uprobes. [0]: https://lkml.org/lkml/2012/2/10/356 [1]: https://github.com/danobi/bpf-dwarf-walk Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/e2718ced2d51ef4268590ab8562962438ab82815.1629772842.git.dxu@dxuuu.xyz
2021-08-25bpf: Extend bpf_base_func_proto helpers with bpf_get_current_task_btf()Daniel Xu
bpf_get_current_task() is already supported so it's natural to also include the _btf() variant for btf-powered helpers. This is required for non-tracing progs to use bpf_task_pt_regs() in the next commit. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/f99870ed5f834c9803d73b3476f8272b1bb987c0.1629772842.git.dxu@dxuuu.xyz
2021-08-25bpf: Consolidate task_struct BTF_ID declarationsDaniel Xu
No need to have it defined 5 times. Once is enough. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/6dcefa5bed26fe1226f26683f36819bb53ec19a2.1629772842.git.dxu@dxuuu.xyz
2021-08-25bpf: Add BTF_ID_LIST_GLOBAL_SINGLE macroDaniel Xu
Same as BTF_ID_LIST_SINGLE macro except defines a global ID. Signed-off-by: Daniel Xu <dxu@dxuuu.xyz> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/a867a97517df42fd3953eeb5454402b57e74538f.1629772842.git.dxu@dxuuu.xyz
2021-08-25pipe: do FASYNC notifications for every pipe IO, not just state changesLinus Torvalds
It turns out that the SIGIO/FASYNC situation is almost exactly the same as the EPOLLET case was: user space really wants to be notified after every operation. Now, in a perfect world it should be sufficient to only notify user space on "state transitions" when the IO state changes (ie when a pipe goes from unreadable to readable, or from unwritable to writable). User space should then do as much as possible - fully emptying the buffer or what not - and we'll notify it again the next time the state changes. But as with EPOLLET, we have at least one case (stress-ng) where the kernel sent SIGIO due to the pipe being marked for asynchronous notification, but the user space signal handler then didn't actually necessarily read it all before returning (it read more than what was written, but since there could be multiple writes, it could leave data pending). The user space code then expected to get another SIGIO for subsequent writes - even though the pipe had been readable the whole time - and would only then read more. This is arguably a user space bug - and Colin King already fixed the stress-ng code in question - but the kernel regression rules are clear: it doesn't matter if kernel people think that user space did something silly and wrong. What matters is that it used to work. So if user space depends on specific historical kernel behavior, it's a regression when that behavior changes. It's on us: we were silly to have that non-optimal historical behavior, and our old kernel behavior was what user space was tested against. Because of how the FASYNC notification was tied to wakeup behavior, this was first broken by commits f467a6a66419 and 1b6b26ae7053 ("pipe: fix and clarify pipe read/write wakeup logic"), but at the time it seems nobody noticed. Probably because the stress-ng problem case ends up being timing-dependent too. It was then unwittingly fixed by commit 3a34b13a88ca ("pipe: make pipe writes always wake up readers") only to be broken again when by commit 3b844826b6c6 ("pipe: avoid unnecessary EPOLLET wakeups under normal loads"). And at that point the kernel test robot noticed the performance refression in the stress-ng.sigio.ops_per_sec case. So the "Fixes" tag below is somewhat ad hoc, but it matches when the issue was noticed. Fix it for good (knock wood) by simply making the kill_fasync() case separate from the wakeup case. FASYNC is quite rare, and we clearly shouldn't even try to use the "avoid unnecessary wakeups" logic for it. Link: https://lore.kernel.org/lkml/20210824151337.GC27667@xsang-OptiPlex-9020/ Fixes: 3b844826b6c6 ("pipe: avoid unnecessary EPOLLET wakeups under normal loads") Reported-by: kernel test robot <oliver.sang@intel.com> Tested-by: Oliver Sang <oliver.sang@intel.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Colin Ian King <colin.king@canonical.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-08-25arm64: mm: fix comment typo of pud_offset_phys()Xujun Leng
Fix a typo in the comment of macro pud_offset_phys(). Signed-off-by: Xujun Leng <lengxujun2007@126.com> Link: https://lore.kernel.org/r/20210825150526.12582-1-lengxujun2007@126.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-08-25Merge branch 'for-v5.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull ucount fixes from Eric Biederman: "This branch fixes a regression that made it impossible to increase rlimits that had been converted to the ucount infrastructure, and also fixes a reference counting bug where the reference was not incremented soon enough. The fixes are trivial and the bugs have been encountered in the wild, and the fixes have been tested" * 'for-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: ucounts: Increase ucounts reference counter before the security hook ucounts: Fix regression preventing increasing of rlimits in init_user_ns
2021-08-25cgroup/cpuset: Avoid memory migration when nodemasks matchNicolas Saenz Julienne
With the introduction of ee9707e8593d ("cgroup/cpuset: Enable memory migration for cpuset v2") attaching a process to a different cgroup will trigger a memory migration regardless of whether it's really needed. Memory migration is an expensive operation, so bypass it if the nodemasks passed to cpuset_migrate_mm() are equal. Note that we're not only avoiding the migration work itself, but also a call to lru_cache_disable(), which triggers and flushes an LRU drain work on every online CPU. Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2021-08-25arm64: signal32: Drop pointless call to sigdelsetmask()Will Deacon
Commit 77097ae503b1 ("most of set_current_blocked() callers want SIGKILL/SIGSTOP removed from set") extended set_current_blocked() to remove SIGKILL and SIGSTOP from the new signal set and updated all callers accordingly. Unfortunately, this collided with the merge of the arm64 architecture, which duly removes these signals when restoring the compat sigframe, as this was what was previously done by arch/arm/. Remove the redundant call to sigdelsetmask() from compat_restore_sigframe(). Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210825093911.24493-1-will@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-08-25Merge remote-tracking branch 'regulator/for-5.15' into regulator-nextMark Brown
2021-08-25Merge remote-tracking branch 'regulator/for-5.14' into regulator-linusMark Brown
2021-08-25ceph: fix possible null-pointer dereference in ceph_mdsmap_decode()Tuo Li
kcalloc() is called to allocate memory for m->m_info, and if it fails, ceph_mdsmap_destroy() behind the label out_err will be called: ceph_mdsmap_destroy(m); In ceph_mdsmap_destroy(), m->m_info is dereferenced through: kfree(m->m_info[i].export_targets); To fix this possible null-pointer dereference, check m->m_info before the for loop to free m->m_info[i].export_targets. [ jlayton: fix up whitespace damage only kfree(m->m_info) if it's non-NULL ] Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Tuo Li <islituo@gmail.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-08-25ceph: correctly handle releasing an embedded cap flushXiubo Li
The ceph_cap_flush structures are usually dynamically allocated, but the ceph_cap_snap has an embedded one. When force umounting, the client will try to remove all the session caps. During this, it will free them, but that should not be done with the ones embedded in a capsnap. Fix this by adding a new boolean that indicates that the cap flush is embedded in a capsnap, and skip freeing it if that's set. At the same time, switch to using list_del_init() when detaching the i_list and g_list heads. It's possible for a forced umount to remove these objects but then handle_cap_flushsnap_ack() races in and does the list_del_init() again, corrupting memory. Cc: stable@vger.kernel.org URL: https://tracker.ceph.com/issues/52283 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-08-25cachefiles: Use file_inode() rather than accessing ->f_inodeDavid Howells
Use the file_inode() helper rather than accessing ->f_inode directly. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/162431192403.2908479.4590814090994846904.stgit@warthog.procyon.org.uk/
2021-08-25netfs: Move cookie debug ID to struct netfs_cache_resourcesDavid Howells
Move the cookie debug ID from struct netfs_read_request to struct netfs_cache_resources and drop the 'cookie_' prefix. This makes it available for things that want to use netfs_cache_resources without having a netfs_read_request. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/162431190784.2908479.13386972676539789127.stgit@warthog.procyon.org.uk/
2021-08-25fscache: Select netfs stats if fscache stats are enabledDavid Howells
Unconditionally select the stats produced by the netfs lib if fscache stats are enabled as the former are displayed in the latter's procfile. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-cachefs@redhat.com Reviewed-by: Jeff Layton <jlayton@redhat.com> Link: https://lore.kernel.org/r/162280352566.3319242.10615341893991206961.stgit@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/162431189627.2908479.9165349423842063755.stgit@warthog.procyon.org.uk/
2021-08-25erofs: fix double free of 'copied'Gao Xiang
Dan reported a new smatch warning [1] "fs/erofs/inode.c:210 erofs_read_inode() error: double free of 'copied'" Due to new chunk-based format handling logic, the error path can be called after kfree(copied). Set "copied = NULL" after kfree(copied) to fix this. [1] https://lore.kernel.org/r/202108251030.bELQozR7-lkp@intel.com Link: https://lore.kernel.org/r/20210825120757.11034-1-hsiangkao@linux.alibaba.com Fixes: c5aa903a59db ("erofs: support reading chunk-based uncompressed files") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2021-08-25locking/rtmutex: Dequeue waiter on ww_mutex deadlockThomas Gleixner
The rt_mutex based ww_mutex variant queues the new waiter first in the lock's rbtree before evaluating the ww_mutex specific conditions which might decide that the waiter should back out. This check and conditional exit happens before the waiter is enqueued into the PI chain. The failure handling at the call site assumes that the waiter, if it is the top most waiter on the lock, is queued in the PI chain and then proceeds to adjust the unmodified PI chain, which results in RB tree corruption. Dequeue the waiter from the lock waiter list in the ww_mutex error exit path to prevent this. Fixes: add461325ec5 ("locking/rtmutex: Extend the rtmutex core to support ww_mutex") Reported-by: Sebastian Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210825102454.042280541@linutronix.de
2021-08-25locking/rtmutex: Dont dereference waiter locklessThomas Gleixner
The new rt_mutex_spin_on_onwer() loop checks whether the spinning waiter is still the top waiter on the lock by utilizing rt_mutex_top_waiter(), which is broken because that function contains a sanity check which dereferences the top waiter pointer to check whether the waiter belongs to the lock. That's wrong in the lockless spinwait case: CPU 0 CPU 1 rt_mutex_lock(lock) rt_mutex_lock(lock); queue(waiter0) waiter0 == rt_mutex_top_waiter(lock) rt_mutex_spin_on_onwer(lock, waiter0) { queue(waiter1) waiter1 == rt_mutex_top_waiter(lock) ... top_waiter = rt_mutex_top_waiter(lock) leftmost = rb_first_cached(&lock->waiters); -> signal dequeue(waiter1) destroy(waiter1) w = rb_entry(leftmost, ....) BUG_ON(w->lock != lock) <- UAF The BUG_ON() is correct for the case where the caller holds lock->wait_lock which guarantees that the leftmost waiter entry cannot vanish. For the lockless spinwait case it's broken. Create a new helper function which avoids the pointer dereference and just compares the leftmost entry pointer with current's waiter pointer to validate that currrent is still elegible for spinning. Fixes: 992caf7f1724 ("locking/rtmutex: Add adaptive spinwait mechanism") Reported-by: Sebastian Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210825102453.981720644@linutronix.de
2021-08-25perf/x86/intel/pt: Fix mask of num_address_rangesXiaoyao Li
Per SDM, bit 2:0 of CPUID(0x14,1).EAX[2:0] reports the number of configurable address ranges for filtering, not bit 1:0. Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Link: https://lkml.kernel.org/r/20210824040622.4081502-1-xiaoyao.li@intel.com
2021-08-25regulator: vctrl: Avoid lockdep warning in enable/disable opsChen-Yu Tsai
vctrl_enable() and vctrl_disable() call regulator_enable() and regulator_disable(), respectively. However, vctrl_* are regulator ops and should not be calling the locked regulator APIs. Doing so results in a lockdep warning. Instead of exporting more internal regulator ops, model the ctrl supply as an actual supply to vctrl-regulator. At probe time this driver still needs to use the consumer API to fetch its constraints, but otherwise lets the regulator core handle the upstream supply for it. The enable/disable/is_enabled ops are not removed, but now only track state internally. This preserves the original behavior with the ops being available, but one could argue that the original behavior was already incorrect: the internal state would not match the upstream supply if that supply had another consumer that enabled the supply, while vctrl-regulator was not enabled. The lockdep warning is as follows: WARNING: possible circular locking dependency detected 5.14.0-rc6 #2 Not tainted ------------------------------------------------------ swapper/0/1 is trying to acquire lock: ffffffc011306d00 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent (arch/arm64/include/asm/current.h:19 include/linux/ww_mutex.h:111 drivers/regulator/core.c:329) but task is already holding lock: ffffff8004a77160 (regulator_ww_class_mutex){+.+.}-{3:3}, at: regulator_lock_recursive (drivers/regulator/core.c:156 drivers/regulator/core.c:263) which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (regulator_ww_class_mutex){+.+.}-{3:3}: __mutex_lock_common (include/asm-generic/atomic-instrumented.h:606 include/asm-generic/atomic-long.h:29 kernel/locking/mutex.c:103 kernel/locking/mutex.c:144 kernel/locking/mutex.c:963) ww_mutex_lock (kernel/locking/mutex.c:1199) regulator_lock_recursive (drivers/regulator/core.c:156 drivers/regulator/core.c:263) regulator_lock_dependent (drivers/regulator/core.c:343) regulator_enable (drivers/regulator/core.c:2808) set_machine_constraints (drivers/regulator/core.c:1536) regulator_register (drivers/regulator/core.c:5486) devm_regulator_register (drivers/regulator/devres.c:196) reg_fixed_voltage_probe (drivers/regulator/fixed.c:289) platform_probe (drivers/base/platform.c:1427) [...] -> #1 (regulator_ww_class_acquire){+.+.}-{0:0}: regulator_lock_dependent (include/linux/ww_mutex.h:129 drivers/regulator/core.c:329) regulator_enable (drivers/regulator/core.c:2808) set_machine_constraints (drivers/regulator/core.c:1536) regulator_register (drivers/regulator/core.c:5486) devm_regulator_register (drivers/regulator/devres.c:196) reg_fixed_voltage_probe (drivers/regulator/fixed.c:289) [...] -> #0 (regulator_list_mutex){+.+.}-{3:3}: __lock_acquire (kernel/locking/lockdep.c:3052 (discriminator 4) kernel/locking/lockdep.c:3174 (discriminator 4) kernel/locking/lockdep.c:3789 (discriminator 4) kernel/locking/lockdep.c:5015 (discriminator 4)) lock_acquire (arch/arm64/include/asm/percpu.h:39 kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5627) __mutex_lock_common (include/asm-generic/atomic-instrumented.h:606 include/asm-generic/atomic-long.h:29 kernel/locking/mutex.c:103 kernel/locking/mutex.c:144 kernel/locking/mutex.c:963) mutex_lock_nested (kernel/locking/mutex.c:1125) regulator_lock_dependent (arch/arm64/include/asm/current.h:19 include/linux/ww_mutex.h:111 drivers/regulator/core.c:329) regulator_enable (drivers/regulator/core.c:2808) vctrl_enable (drivers/regulator/vctrl-regulator.c:400) _regulator_do_enable (drivers/regulator/core.c:2617) _regulator_enable (drivers/regulator/core.c:2764) regulator_enable (drivers/regulator/core.c:308 drivers/regulator/core.c:2809) _set_opp (drivers/opp/core.c:819 drivers/opp/core.c:1072) dev_pm_opp_set_rate (drivers/opp/core.c:1164) set_target (drivers/cpufreq/cpufreq-dt.c:62) __cpufreq_driver_target (drivers/cpufreq/cpufreq.c:2216 drivers/cpufreq/cpufreq.c:2271) cpufreq_online (drivers/cpufreq/cpufreq.c:1488 (discriminator 2)) cpufreq_add_dev (drivers/cpufreq/cpufreq.c:1563) subsys_interface_register (drivers/base/bus.c:?) cpufreq_register_driver (drivers/cpufreq/cpufreq.c:2819) dt_cpufreq_probe (drivers/cpufreq/cpufreq-dt.c:344) [...] other info that might help us debug this: Chain exists of: regulator_list_mutex --> regulator_ww_class_acquire --> regulator_ww_class_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(regulator_ww_class_mutex); lock(regulator_ww_class_acquire); lock(regulator_ww_class_mutex); lock(regulator_list_mutex); *** DEADLOCK *** 6 locks held by swapper/0/1: #0: ffffff8002d32188 (&dev->mutex){....}-{3:3}, at: __device_driver_lock (drivers/base/dd.c:1030) #1: ffffffc0111a0520 (cpu_hotplug_lock){++++}-{0:0}, at: cpufreq_register_driver (drivers/cpufreq/cpufreq.c:2792 (discriminator 2)) #2: ffffff8002a8d918 (subsys mutex#9){+.+.}-{3:3}, at: subsys_interface_register (drivers/base/bus.c:1033) #3: ffffff800341bb90 (&policy->rwsem){+.+.}-{3:3}, at: cpufreq_online (include/linux/bitmap.h:285 include/linux/cpumask.h:405 drivers/cpufreq/cpufreq.c:1399) #4: ffffffc011f0b7b8 (regulator_ww_class_acquire){+.+.}-{0:0}, at: regulator_enable (drivers/regulator/core.c:2808) #5: ffffff8004a77160 (regulator_ww_class_mutex){+.+.}-{3:3}, at: regulator_lock_recursive (drivers/regulator/core.c:156 drivers/regulator/core.c:263) stack backtrace: CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc6 #2 7c8f8996d021ed0f65271e6aeebf7999de74a9fa Hardware name: Google Scarlet (DT) Call trace: dump_backtrace (arch/arm64/kernel/stacktrace.c:161) show_stack (arch/arm64/kernel/stacktrace.c:218) dump_stack_lvl (lib/dump_stack.c:106 (discriminator 2)) dump_stack (lib/dump_stack.c:113) print_circular_bug (kernel/locking/lockdep.c:?) check_noncircular (kernel/locking/lockdep.c:?) __lock_acquire (kernel/locking/lockdep.c:3052 (discriminator 4) kernel/locking/lockdep.c:3174 (discriminator 4) kernel/locking/lockdep.c:3789 (discriminator 4) kernel/locking/lockdep.c:5015 (discriminator 4)) lock_acquire (arch/arm64/include/asm/percpu.h:39 kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5627) __mutex_lock_common (include/asm-generic/atomic-instrumented.h:606 include/asm-generic/atomic-long.h:29 kernel/locking/mutex.c:103 kernel/locking/mutex.c:144 kernel/locking/mutex.c:963) mutex_lock_nested (kernel/locking/mutex.c:1125) regulator_lock_dependent (arch/arm64/include/asm/current.h:19 include/linux/ww_mutex.h:111 drivers/regulator/core.c:329) regulator_enable (drivers/regulator/core.c:2808) vctrl_enable (drivers/regulator/vctrl-regulator.c:400) _regulator_do_enable (drivers/regulator/core.c:2617) _regulator_enable (drivers/regulator/core.c:2764) regulator_enable (drivers/regulator/core.c:308 drivers/regulator/core.c:2809) _set_opp (drivers/opp/core.c:819 drivers/opp/core.c:1072) dev_pm_opp_set_rate (drivers/opp/core.c:1164) set_target (drivers/cpufreq/cpufreq-dt.c:62) __cpufreq_driver_target (drivers/cpufreq/cpufreq.c:2216 drivers/cpufreq/cpufreq.c:2271) cpufreq_online (drivers/cpufreq/cpufreq.c:1488 (discriminator 2)) cpufreq_add_dev (drivers/cpufreq/cpufreq.c:1563) subsys_interface_register (drivers/base/bus.c:?) cpufreq_register_driver (drivers/cpufreq/cpufreq.c:2819) dt_cpufreq_probe (drivers/cpufreq/cpufreq-dt.c:344) [...] Reported-by: Brian Norris <briannorris@chromium.org> Fixes: f8702f9e4aa7 ("regulator: core: Use ww_mutex for regulators locking") Fixes: e9153311491d ("regulator: vctrl-regulator: Avoid deadlock getting and setting the voltage") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Link: https://lore.kernel.org/r/20210825033704.3307263-3-wenst@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org>
2021-08-25regulator: vctrl: Use locked regulator_get_voltage in probe pathChen-Yu Tsai
In commit e9153311491d ("regulator: vctrl-regulator: Avoid deadlock getting and setting the voltage"), all calls to get/set the voltage of the control regulator were switched to unlocked versions to avoid deadlocks. However, the call in the probe path is done without regulator locks held. In this case the locked version should be used. Switch back to the locked regulator_get_voltage() in the probe path to avoid any mishaps. Fixes: e9153311491d ("regulator: vctrl-regulator: Avoid deadlock getting and setting the voltage") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Link: https://lore.kernel.org/r/20210825033704.3307263-2-wenst@chromium.org Signed-off-by: Mark Brown <broonie@kernel.org>
2021-08-25ASoC: imx-rpmsg: change dev_err to dev_err_probe for -EPROBE_DEFERShengjiu Wang
Change dev_err to dev_err_probe for no need print error message when defer probe happens. Fixes: 39f8405c3e50 ("ASoC: imx-rpmsg: Add machine driver for audio base on rpmsg") Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Link: https://lore.kernel.org/r/1629875681-16373-1-git-send-email-shengjiu.wang@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-08-25ASoC: rt5682: Fix the vol+ button detection issueDerek Fang
Fix the wrong button vol+ detection issue with some brand headsets by fine tuning the threshold of button vol+ and SAR ADC button accuracy. Signed-off-by: Derek Fang <derek.fang@realtek.com> Link: https://lore.kernel.org/r/20210825040346.28346-1-derek.fang@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2021-08-25ASoC: Intel: bytcr_rt5640: Make rt5640_jack_gpio/rt5640_jack2_gpio staticPeter Ujfalusi
Marking the two jack gpio as static fixes the following Sparse errors: sound/soc/intel/boards/bytcr_rt5640.c:468:26: error: symbol 'rt5640_jack_gpio' was not declared. Should it be static? sound/soc/intel/boards/bytcr_rt5640.c:475:26: error: symbol 'rt5640_jack2_gpio' was not declared. Should it be static? Fixes: 9ba00856686ad ("ASoC: Intel: bytcr_rt5640: Add support for HP Elite Pad 1000G2 jack-detect") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://lore.kernel.org/r/20210825122519.3364-1-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>