summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-08-18arm_pmu: acpi: Add a representative platform device for TRBEAnshuman Khandual
ACPI TRBE does not have a HID for identification which could create and add a platform device into the platform bus. Also without a platform device, it cannot be probed and bound to a platform driver. This creates a dummy platform device for TRBE after ascertaining that ACPI provides required interrupts uniformly across all cpus on the system. This device gets created inside drivers/perf/arm_pmu_acpi.c to accommodate TRBE being built as a module. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20230817055405.249630-3-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-18arm_pmu: acpi: Refactor arm_spe_acpi_register_device()Anshuman Khandual
Sanity checking all the GICC tables for same interrupt number, and ensuring a homogeneous ACPI based machine, could be used for other platform devices as well. Hence this refactors arm_spe_acpi_register_device() into a common helper arm_acpi_register_pmu_device(). Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Co-developed-by: Will Deacon <will@kernel.org> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20230817055405.249630-2-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-18hw_breakpoint: fix single-stepping when using bpf_overflow_handlerTomislav Novak
Arm platforms use is_default_overflow_handler() to determine if the hw_breakpoint code should single-step over the breakpoint trigger or let the custom handler deal with it. Since bpf_overflow_handler() currently isn't recognized as a default handler, attaching a BPF program to a PERF_TYPE_BREAKPOINT event causes it to keep firing (the instruction triggering the data abort exception is never skipped). For example: # bpftrace -e 'watchpoint:0x10000:4:w { print("hit") }' -c ./test Attaching 1 probe... hit hit [...] ^C (./test performs a single 4-byte store to 0x10000) This patch replaces the check with uses_default_overflow_handler(), which accounts for the bpf_overflow_handler() case by also testing if one of the perf_event_output functions gets invoked indirectly, via orig_default_handler. Signed-off-by: Tomislav Novak <tnovak@meta.com> Tested-by: Samuel Gosselin <sgosselin@google.com> # arm64 Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/linux-arm-kernel/20220923203644.2731604-1-tnovak@fb.com/ Link: https://lore.kernel.org/r/20230605191923.1219974-1-tnovak@meta.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-16perf/imx_ddr: don't enable counter0 if none of 4 counters are usedXu Yang
In current driver, counter0 will be enabled after ddr_perf_pmu_enable() is called even though none of the 4 counters are used. This will cause counter0 continue to count until ddr_perf_pmu_disabled() is called. If pmu is not disabled all the time, the pmu interrupt will be asserted from time to time due to counter0 will overflow and irq handler will clear it. It's not an expected behavior. This patch will not enable counter0 if none of 4 counters are used. Fixes: 9a66d36cc7ac ("drivers/perf: imx_ddr: Add DDR performance counter support to perf") Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20230811015438.1999307-2-xu.yang_2@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-16perf/imx_ddr: speed up overflow frequency of cycleXu Yang
For i.MX8MP, we cannot ensure that cycle counter overflow occurs at least 4 times as often as other events. Due to byte counters will count for any event configured, it will overflow more often. And if byte counters overflow that related counters would stop since they share the COUNTER_CNTL. We can speed up cycle counter overflow frequency by setting counter parameter (CP) field of cycle counter. In this way, we can avoid stop counting byte counters when interrupt didn't come and the byte counters can be fetched or updated from each cycle counter overflow interrupt. Because we initialize CP filed to shorten counter0 overflow time, the cycle counter will start couting from a fixed/base value each time. We need to remove the base from the result too. Therefore, we could get precise result from cycle counter. Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20230811015438.1999307-1-xu.yang_2@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-16drivers/perf: hisi: Schedule perf session according to localityYicong Yang
The PCIe PMUs locate on different NUMA node but currently we don't consider it and likely stack all the sessions on the same CPU: [root@localhost tmp]# cat /sys/devices/hisi_pcie*/cpumask 0 0 0 0 0 0 This can be optimize a bit to use a local CPU for the PMU. Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20230815131010.2147-1-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-16perf/arm-dmc620: Fix dmc620_pmu_irqs_lock/cpu_hotplug_lock circular lock ↵Waiman Long
dependency The following circular locking dependency was reported when running cpus online/offline test on an arm64 system. [ 84.195923] Chain exists of: dmc620_pmu_irqs_lock --> cpu_hotplug_lock --> cpuhp_state-down [ 84.207305] Possible unsafe locking scenario: [ 84.213212] CPU0 CPU1 [ 84.217729] ---- ---- [ 84.222247] lock(cpuhp_state-down); [ 84.225899] lock(cpu_hotplug_lock); [ 84.232068] lock(cpuhp_state-down); [ 84.238237] lock(dmc620_pmu_irqs_lock); [ 84.242236] *** DEADLOCK *** The following locking order happens when dmc620_pmu_get_irq() calls cpuhp_state_add_instance_nocalls(). lock(dmc620_pmu_irqs_lock) --> lock(cpu_hotplug_lock) On the other hand, the calling sequence cpuhp_thread_fun() => cpuhp_invoke_callback() => dmc620_pmu_cpu_teardown() leads to the locking sequence lock(cpuhp_state-down) => lock(dmc620_pmu_irqs_lock) Here dmc620_pmu_irqs_lock protects both the dmc620_pmu_irqs and the pmus_node lists in various dmc620_pmu instances. dmc620_pmu_get_irq() requires protected access to dmc620_pmu_irqs whereas dmc620_pmu_cpu_teardown() needs protection to the pmus_node lists. Break this circular locking dependency by using two separate locks to protect dmc620_pmu_irqs list and the pmus_node lists respectively. Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/r/20230812235549.494174-1-longman@redhat.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-15perf/smmuv3: Add MODULE_ALIAS for module auto loadingYicong Yang
On my ACPI based arm64 server, if the SMMUv3 PMU is configured as module it won't be loaded automatically after booting even if the device has already been scanned and added. It's because the module lacks a platform alias, the uevent mechanism and userspace tools like udevd make use of this to find the target driver module of the device. This patch adds the missing platform alias of the module, then module will be loaded automatically if device exists. Before this patch: [root@localhost tmp]# modinfo arm_smmuv3_pmu | grep alias alias: of:N*T*Carm,smmu-v3-pmcgC* alias: of:N*T*Carm,smmu-v3-pmcg After this patch: [root@localhost tmp]# modinfo arm_smmuv3_pmu | grep alias alias: platform:arm-smmu-v3-pmcg alias: of:N*T*Carm,smmu-v3-pmcgC* alias: of:N*T*Carm,smmu-v3-pmcg Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20230814131642.65263-1-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-15perf/smmuv3: Enable HiSilicon Erratum 162001900 quirk for HIP08/09Yicong Yang
Some HiSilicon SMMU PMCG suffers the erratum 162001900 that the PMU disable control sometimes fail to disable the counters. This will lead to error or inaccurate data since before we enable the counters the counter's still counting for the event used in last perf session. This patch tries to fix this by hardening the global disable process. Before disable the PMU, writing an invalid event type (0xffff) to focibly stop the counters. Correspondingly restore each events on pmu::pmu_enable(). Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20230814124012.58013-1-yangyicong@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-08-04perf: pmuv3: Remove comments from armv8pmu_[enable|disable]_event()Anshuman Khandual
The comments in armv8pmu_[enable|disable]_event() are blindingly obvious, and does not contribute in making things any better. Let's drop them off. Functional change is not intended. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20230802090853.1190391-1-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-07-28perf/arm-cmn: Add CMN-700 r3 supportRobin Murphy
CMN-700 r3 has a special configuration option for a so-called "Super Home Node", which is a superset of the standard HN-F that also manages remote-chip coherency for multi-chip setups. As such it has a similar but expanded set of PMU events compared to HN-F, with some additional filtering options to boot. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/49153b72253f6af0e625cb55b9e1b825b110c49c.1688746690.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-07-28perf/arm-cmn: Refactor HN-F event selector macrosRobin Murphy
Refactor the macros for defining HN-F events with additional selectors, so they can be shared with another upcoming similar-but-distinct HN type. No functional change intended. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/0f05327941e06c665dbfd47e03fad29276b9e63c.1688746690.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-07-28perf/arm-cmn: Remove spurious event aliasesRobin Murphy
As the name suggests, the "partial DAT flit" event is only counted for the DAT channel, and furthermore is only applicable to device ports, not mesh links (strictly it's only device ports with CHI-A requesters connected, but detecting that degree of detail is more bother than it's worth). Stop generating spurious event aliases for other combinations which aren't meaningful. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/b01a58e3ff05c322547fbfd015f6dbfedf555ed3.1688746690.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-07-27drivers/perf: Explicitly include correct DT includesRob Herring
The DT of_device.h and of_platform.h date back to the separate of_platform_bus_type before it as merged into the regular platform bus. As part of that merge prepping Arm DT support 13 years ago, they "temporarily" include each other. They also include platform_device.h and of.h. As a result, there's a pretty much random mix of those include files used throughout the tree. In order to detangle these headers and replace the implicit includes with struct declarations, users need to explicitly include the correct includes. Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230714174832.4061752-1-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-07-27perf: pmuv3: Add Cortex A520, A715, A720, X3 and X4 PMUsRob Herring
Add support for the Arm Cortex-A520, Cortex-A715, Cortex-A720, Cortex-X3, and Cortex-X4 CPU PMUs. They are straight-forward additions with just new compatible strings. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230706205505.308523-2-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-07-27dt-bindings: arm: pmu: Add Cortex A520, A715, A720, X3, and X4Rob Herring
Add compatible strings for the Arm Cortex-A520, Cortex-A715, Cortex-A720, Cortex-X3, and Cortex-X4 CPU PMUs. Acked-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230706205505.308523-1-robh@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-07-27perf/smmuv3: Remove build dependency on ACPIVincent Whitchurch
This driver supports working without ACPI since commit 3f7be43561766 ("perf/smmuv3: Add devicetree support"), so remove the build dependency. Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20230706-smmuv3-pmu-noacpi-v1-1-7083ef189158@axis.com Signed-off-by: Will Deacon <will@kernel.org>
2023-07-27perf: xgene_pmu: Convert to devm_platform_ioremap_resource()Yangtao Li
Use devm_platform_ioremap_resource() to simplify code. Signed-off-by: Yangtao Li <frank.li@vivo.com> Link: https://lore.kernel.org/r/20230704093556.17926-1-frank.li@vivo.com Signed-off-by: Will Deacon <will@kernel.org>
2023-07-27driver/perf: Add identifier sysfs file for Yitian 710 DDRJing Zhang
To allow userspace to identify the specific implementation of the device, add an "identifier" sysfs file. The perf tool can match the Yitian 710 DDR metric through the identifier. Signed-off-by: Jing Zhang <renyu.zj@linux.alibaba.com> Acked-by: Ian Rogers <irogers@google.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Link: https://lore.kernel.org/r/1687245156-61215-2-git-send-email-renyu.zj@linux.alibaba.com Signed-off-by: Will Deacon <will@kernel.org>
2023-07-23Linux 6.5-rc3v6.5-rc3Linus Torvalds
2023-07-23Merge tag 'trace-v6.5-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Swapping the ring buffer for snapshotting (for things like irqsoff) can crash if the ring buffer is being resized. Disable swapping when this happens. The missed swap will be reported to the tracer - Report error if the histogram fails to be created due to an error in adding a histogram variable, in event_hist_trigger_parse() - Remove unused declaration of tracing_map_set_field_descr() * tag 'trace-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/histograms: Return an error if we fail to add histogram to hist_vars list ring-buffer: Do not swap cpu_buffer during resize process tracing: Remove unused extern declaration tracing_map_set_field_descr()
2023-07-23Merge tag 'kbuild-fixes-v6.5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix stale help text in gconfig - Support *.S files in compile_commands.json - Flatten KBUILD_CFLAGS - Fix external module builds with Rust so that temporary files are created in the modules directories instead of the kernel tree * tag 'kbuild-fixes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: rust: avoid creating temporary files kbuild: flatten KBUILD_CFLAGS gen_compile_commands: add assembly files to compilation database kconfig: gconfig: correct program name in help text kconfig: gconfig: drop the Show Debug Info help text
2023-07-24kbuild: rust: avoid creating temporary filesMiguel Ojeda
`rustc` outputs by default the temporary files (i.e. the ones saved by `-Csave-temps`, such as `*.rcgu*` files) in the current working directory when `-o` and `--out-dir` are not given (even if `--emit=x=path` is given, i.e. it does not use those for temporaries). Since out-of-tree modules are compiled from the `linux` tree, `rustc` then tries to create them there, which may not be accessible. Thus pass `--out-dir` explicitly, even if it is just for the temporary files. Similarly, do so for Rust host programs too. Reported-by: Raphael Nestler <raphael.nestler@gmail.com> Closes: https://github.com/Rust-for-Linux/linux/issues/1015 Reported-by: Andrea Righi <andrea.righi@canonical.com> Tested-by: Raphael Nestler <raphael.nestler@gmail.com> # non-hostprogs Tested-by: Andrea Righi <andrea.righi@canonical.com> # non-hostprogs Fixes: 295d8398c67e ("kbuild: specify output names separately for each emission type from rustc") Cc: stable@vger.kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Tested-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-07-23Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "ARM: - Avoid pKVM finalization if KVM initialization fails - Add missing BTI instructions in the hypervisor, fixing an early boot failure on BTI systems - Handle MMU notifiers correctly for non hugepage-aligned memslots - Work around a bug in the architecture where hypervisor timer controls have UNKNOWN behavior under nested virt - Disable preemption in kvm_arch_hardware_enable(), fixing a kernel BUG in cpu hotplug resulting from per-CPU accessor sanity checking - Make WFI emulation on GICv4 systems robust w.r.t. preemption, consistently requesting a doorbell interrupt on vcpu_put() - Uphold RES0 sysreg behavior when emulating older PMU versions - Avoid macro expansion when initializing PMU register names, ensuring the tracepoints pretty-print the sysreg s390: - Two fixes for asynchronous destroy x86 fixes will come early next week" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: s390: pv: fix index value of replaced ASCE KVM: s390: pv: simplify shutdown and fix race KVM: arm64: Fix the name of sys_reg_desc related to PMU KVM: arm64: Correctly handle RES0 bits PMEVTYPER<n>_EL0.evtCount KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemption KVM: arm64: Add missing BTI instructions KVM: arm64: Correctly handle page aging notifiers for unaligned memslot KVM: arm64: Disable preemption in kvm_arch_hardware_enable() KVM: arm64: Handle kvm_arm_init failure correctly in finalize_pkvm KVM: arm64: timers: Use CNTHCTL_EL2 when setting non-CNTKCTL_EL1 bits
2023-07-23Merge tag 'ext4_for_linus-6.5-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "Bug and regression fixes for 6.5-rc3 for ext4's mballoc and jbd2's checkpoint code" * tag 'ext4_for_linus-6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix rbtree traversal bug in ext4_mb_use_preallocated ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail() ext4: correct inline offset when handling xattrs in inode body jbd2: remove __journal_try_to_free_buffer() jbd2: fix a race when checking checkpoint buffer busy jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint jbd2: remove journal_clean_one_cp_list() jbd2: remove t_checkpoint_io_list jbd2: recheck chechpointing non-dirty buffer
2023-07-23Merge tag '6.5-rc2-smb3-client-fixes-ver2' of ↵Linus Torvalds
git://git.samba.org/sfrench/cifs-2.6 Pull smb client fix from Steve French: "Add minor debugging improvement. The change improves ability to read a network trace to debug problems on encrypted connections which are very common (e.g. using wireshark or tcpdump). That works today with tools like 'smbinfo keys /mnt/file' but requires passing in a filename on the mount (see e.g. [1]), but it often makes more sense to just pass in the mount point path (ie a directory not a filename). So this fix was needed to debug some types of problems (an obvious example is on an encrypted connection failing operations on an empty share or with no files in the root of the directory) - so you can simply pass in the 'smbinfo keys <mntpoint>' and get the information that wireshark needs" Link: https://wiki.samba.org/index.php/Wireshark_Decryption [1] * tag '6.5-rc2-smb3-client-fixes-ver2' of git://git.samba.org/sfrench/cifs-2.6: cifs: update internal module version number for cifs.ko cifs: allow dumping keys for directories too
2023-07-23Merge tag 'kvm-s390-master-6.5-1' of ↵Paolo Bonzini
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD Two fixes for asynchronous destroy
2023-07-23Merge tag 'kvmarm-fixes-6.5-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.5, part #1 - Avoid pKVM finalization if KVM initialization fails - Add missing BTI instructions in the hypervisor, fixing an early boot failure on BTI systems - Handle MMU notifiers correctly for non hugepage-aligned memslots - Work around a bug in the architecture where hypervisor timer controls have UNKNOWN behavior under nested virt. - Disable preemption in kvm_arch_hardware_enable(), fixing a kernel BUG in cpu hotplug resulting from per-CPU accessor sanity checking. - Make WFI emulation on GICv4 systems robust w.r.t. preemption, consistently requesting a doorbell interrupt on vcpu_put() - Uphold RES0 sysreg behavior when emulating older PMU versions - Avoid macro expansion when initializing PMU register names, ensuring the tracepoints pretty-print the sysreg.
2023-07-23tracing/histograms: Return an error if we fail to add histogram to hist_vars ↵Mohamed Khalfella
list Commit 6018b585e8c6 ("tracing/histograms: Add histograms to hist_vars if they have referenced variables") added a check to fail histogram creation if save_hist_vars() failed to add histogram to hist_vars list. But the commit failed to set ret to failed return code before jumping to unregister histogram, fix it. Link: https://lore.kernel.org/linux-trace-kernel/20230714203341.51396-1-mkhalfella@purestorage.com Cc: stable@vger.kernel.org Fixes: 6018b585e8c6 ("tracing/histograms: Add histograms to hist_vars if they have referenced variables") Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-23ring-buffer: Do not swap cpu_buffer during resize processChen Lin
When ring_buffer_swap_cpu was called during resize process, the cpu buffer was swapped in the middle, resulting in incorrect state. Continuing to run in the wrong state will result in oops. This issue can be easily reproduced using the following two scripts: /tmp # cat test1.sh //#! /bin/sh for i in `seq 0 100000` do echo 2000 > /sys/kernel/debug/tracing/buffer_size_kb sleep 0.5 echo 5000 > /sys/kernel/debug/tracing/buffer_size_kb sleep 0.5 done /tmp # cat test2.sh //#! /bin/sh for i in `seq 0 100000` do echo irqsoff > /sys/kernel/debug/tracing/current_tracer sleep 1 echo nop > /sys/kernel/debug/tracing/current_tracer sleep 1 done /tmp # ./test1.sh & /tmp # ./test2.sh & A typical oops log is as follows, sometimes with other different oops logs. [ 231.711293] WARNING: CPU: 0 PID: 9 at kernel/trace/ring_buffer.c:2026 rb_update_pages+0x378/0x3f8 [ 231.713375] Modules linked in: [ 231.714735] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W 6.5.0-rc1-00276-g20edcec23f92 #15 [ 231.716750] Hardware name: linux,dummy-virt (DT) [ 231.718152] Workqueue: events update_pages_handler [ 231.719714] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 231.721171] pc : rb_update_pages+0x378/0x3f8 [ 231.722212] lr : rb_update_pages+0x25c/0x3f8 [ 231.723248] sp : ffff800082b9bd50 [ 231.724169] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000 [ 231.726102] x26: 0000000000000001 x25: fffffffffffff010 x24: 0000000000000ff0 [ 231.728122] x23: ffff0000c3a0b600 x22: ffff0000c3a0b5c0 x21: fffffffffffffe0a [ 231.730203] x20: ffff0000c3a0b600 x19: ffff0000c0102400 x18: 0000000000000000 [ 231.732329] x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffe7aa8510 [ 231.734212] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000002 [ 231.736291] x11: ffff8000826998a8 x10: ffff800082b9baf0 x9 : ffff800081137558 [ 231.738195] x8 : fffffc00030e82c8 x7 : 0000000000000000 x6 : 0000000000000001 [ 231.740192] x5 : ffff0000ffbafe00 x4 : 0000000000000000 x3 : 0000000000000000 [ 231.742118] x2 : 00000000000006aa x1 : 0000000000000001 x0 : ffff0000c0007208 [ 231.744196] Call trace: [ 231.744892] rb_update_pages+0x378/0x3f8 [ 231.745893] update_pages_handler+0x1c/0x38 [ 231.746893] process_one_work+0x1f0/0x468 [ 231.747852] worker_thread+0x54/0x410 [ 231.748737] kthread+0x124/0x138 [ 231.749549] ret_from_fork+0x10/0x20 [ 231.750434] ---[ end trace 0000000000000000 ]--- [ 233.720486] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 233.721696] Mem abort info: [ 233.721935] ESR = 0x0000000096000004 [ 233.722283] EC = 0x25: DABT (current EL), IL = 32 bits [ 233.722596] SET = 0, FnV = 0 [ 233.722805] EA = 0, S1PTW = 0 [ 233.723026] FSC = 0x04: level 0 translation fault [ 233.723458] Data abort info: [ 233.723734] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 233.724176] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 233.724589] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 233.725075] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000104943000 [ 233.725592] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 [ 233.726231] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 233.726720] Modules linked in: [ 233.727007] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W 6.5.0-rc1-00276-g20edcec23f92 #15 [ 233.727777] Hardware name: linux,dummy-virt (DT) [ 233.728225] Workqueue: events update_pages_handler [ 233.728655] pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 233.729054] pc : rb_update_pages+0x1a8/0x3f8 [ 233.729334] lr : rb_update_pages+0x154/0x3f8 [ 233.729592] sp : ffff800082b9bd50 [ 233.729792] x29: ffff800082b9bd50 x28: ffff8000825f7000 x27: 0000000000000000 [ 233.730220] x26: 0000000000000000 x25: ffff800082a8b840 x24: ffff0000c0102418 [ 233.730653] x23: 0000000000000000 x22: fffffc000304c880 x21: 0000000000000003 [ 233.731105] x20: 00000000000001f4 x19: ffff0000c0102400 x18: ffff800082fcbc58 [ 233.731727] x17: 0000000000000000 x16: 0000000000000001 x15: 0000000000000001 [ 233.732282] x14: ffff8000825fe0c8 x13: 0000000000000001 x12: 0000000000000000 [ 233.732709] x11: ffff8000826998a8 x10: 0000000000000ae0 x9 : ffff8000801b760c [ 233.733148] x8 : fefefefefefefeff x7 : 0000000000000018 x6 : ffff0000c03298c0 [ 233.733553] x5 : 0000000000000002 x4 : 0000000000000000 x3 : 0000000000000000 [ 233.733972] x2 : ffff0000c3a0b600 x1 : 0000000000000000 x0 : 0000000000000000 [ 233.734418] Call trace: [ 233.734593] rb_update_pages+0x1a8/0x3f8 [ 233.734853] update_pages_handler+0x1c/0x38 [ 233.735148] process_one_work+0x1f0/0x468 [ 233.735525] worker_thread+0x54/0x410 [ 233.735852] kthread+0x124/0x138 [ 233.736064] ret_from_fork+0x10/0x20 [ 233.736387] Code: 92400000 910006b5 aa000021 aa0303f7 (f9400060) [ 233.736959] ---[ end trace 0000000000000000 ]--- After analysis, the seq of the error is as follows [1-5]: int ring_buffer_resize(struct trace_buffer *buffer, unsigned long size, int cpu_id) { for_each_buffer_cpu(buffer, cpu) { cpu_buffer = buffer->buffers[cpu]; //1. get cpu_buffer, aka cpu_buffer(A) ... ... schedule_work_on(cpu, &cpu_buffer->update_pages_work); //2. 'update_pages_work' is queue on 'cpu', cpu_buffer(A) is passed to // update_pages_handler, do the update process, set 'update_done' in // complete(&cpu_buffer->update_done) and to wakeup resize process. //----> //3. Just at this moment, ring_buffer_swap_cpu is triggered, //cpu_buffer(A) be swaped to cpu_buffer(B), the max_buffer. //ring_buffer_swap_cpu is called as the 'Call trace' below. Call trace: dump_backtrace+0x0/0x2f8 show_stack+0x18/0x28 dump_stack+0x12c/0x188 ring_buffer_swap_cpu+0x2f8/0x328 update_max_tr_single+0x180/0x210 check_critical_timing+0x2b4/0x2c8 tracer_hardirqs_on+0x1c0/0x200 trace_hardirqs_on+0xec/0x378 el0_svc_common+0x64/0x260 do_el0_svc+0x90/0xf8 el0_svc+0x20/0x30 el0_sync_handler+0xb0/0xb8 el0_sync+0x180/0x1c0 //<---- /* wait for all the updates to complete */ for_each_buffer_cpu(buffer, cpu) { cpu_buffer = buffer->buffers[cpu]; //4. get cpu_buffer, cpu_buffer(B) is used in the following process, //the state of cpu_buffer(A) and cpu_buffer(B) is totally wrong. //for example, cpu_buffer(A)->update_done will leave be set 1, and will //not 'wait_for_completion' at the next resize round. if (!cpu_buffer->nr_pages_to_update) continue; if (cpu_online(cpu)) wait_for_completion(&cpu_buffer->update_done); cpu_buffer->nr_pages_to_update = 0; } ... } //5. the state of cpu_buffer(A) and cpu_buffer(B) is totally wrong, //Continuing to run in the wrong state, then oops occurs. Link: https://lore.kernel.org/linux-trace-kernel/202307191558478409990@zte.com.cn Signed-off-by: Chen Lin <chen.lin5@zte.com.cn> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-23tracing: Remove unused extern declaration tracing_map_set_field_descr()YueHaibing
Since commit 08d43a5fa063 ("tracing: Add lock-free tracing_map"), this is never used, so can be removed. Link: https://lore.kernel.org/linux-trace-kernel/20230722032123.24664-1-yuehaibing@huawei.com Cc: <mhiramat@kernel.org> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-23kbuild: flatten KBUILD_CFLAGSAlexey Dobriyan
Make it slightly easier to see which compiler options are added and removed (and not worry about column limit too!). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reviewed-by: Nicolas Schier <n.schier@avm.de> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-07-23gen_compile_commands: add assembly files to compilation databaseBenjamin Gray
Like C source files, tooling can find it useful to have the assembly source file compilation recorded. The .S extension appears to used across all architectures. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Reviewed-by: Fangrui Song <maskray@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-07-23ext4: fix rbtree traversal bug in ext4_mb_use_preallocatedOjaswin Mujoo
During allocations, while looking for preallocations(PA) in the per inode rbtree, we can't do a direct traversal of the tree because ext4_mb_discard_group_preallocation() can paralelly mark the pa deleted and that can cause direct traversal to skip some entries. This was leading to a BUG_ON() being hit [1] when we missed a PA that could satisfy our request and ultimately tried to create a new PA that would overlap with the missed one. To makes sure we handle that case while still keeping the performance of the rbtree, we make use of the fact that the only pa that could possibly overlap the original goal start is the one that satisfies the below conditions: 1. It must have it's logical start immediately to the left of (ie less than) original logical start. 2. It must not be deleted To find this pa we use the following traversal method: 1. Descend into the rbtree normally to find the immediate neighboring PA. Here we keep descending irrespective of if the PA is deleted or if it overlaps with our request etc. The goal is to find an immediately adjacent PA. 2. If the found PA is on right of original goal, use rb_prev() to find the left adjacent PA. 3. Check if this PA is deleted and keep moving left with rb_prev() until a non deleted PA is found. 4. This is the PA we are looking for. Now we can check if it can satisfy the original request and proceed accordingly. This approach also takes care of having deleted PAs in the tree. (While we are at it, also fix a possible overflow bug in calculating the end of a PA) [1] https://lore.kernel.org/linux-ext4/CA+G9fYv2FRpLqBZf34ZinR8bU2_ZRAUOjKAD3+tKRFaEQHtt8Q@mail.gmail.com/ Cc: stable@kernel.org # 6.4 Fixes: 3872778664e3 ("ext4: Use rbtrees to manage PAs instead of inode i_prealloc_list") Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reviewed-by: Ritesh Harjani (IBM) ritesh.list@gmail.com Tested-by: Ritesh Harjani (IBM) ritesh.list@gmail.com Link: https://lore.kernel.org/r/edd2efda6a83e6343c5ace9deea44813e71dbe20.1690045963.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-07-23ext4: fix off by one issue in ext4_mb_choose_next_group_best_avail()Ojaswin Mujoo
In ext4_mb_choose_next_group_best_avail(), we want the start order to be 1 less than goal length and the min_order to be, at max, 1 more than the original length. This commit fixes an off by one issue that arose due to the fact that 1 << fls(n) > (n). After all the processing: order = 1 order below goal len min_order = maximum of the three:- - order - trim_order - 1 order below B2C(s_stripe) - 1 order above original len Cc: stable@kernel.org Fixes: 33122aa930 ("ext4: Add allocation criteria 1.5 (CR1_5)") Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20230609103403.112807-1-ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-07-23ext4: correct inline offset when handling xattrs in inode bodyEric Whitney
When run on a file system where the inline_data feature has been enabled, xfstests generic/269, generic/270, and generic/476 cause ext4 to emit error messages indicating that inline directory entries are corrupted. This occurs because the inline offset used to locate inline directory entries in the inode body is not updated when an xattr in that shared region is deleted and the region is shifted in memory to recover the space it occupied. If the deleted xattr precedes the system.data attribute, which points to the inline directory entries, that attribute will be moved further up in the region. The inline offset continues to point to whatever is located in system.data's former location, with unfortunate effects when used to access directory entries or (presumably) inline data in the inode body. Cc: stable@kernel.org Signed-off-by: Eric Whitney <enwlinux@gmail.com> Link: https://lore.kernel.org/r/20230522181520.1570360-1-enwlinux@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-07-22Merge tag 'powerpc-6.5-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Reinstate support for little endian ELFv1 binaries, which it turns out still exist in the wild. - Revert a change which used asm goto for WARN_ON/__WARN_FLAGS, as it lead to dead code generation and seemed to trigger compiler bugs in some edge cases. - Fix a deadlock in the pseries VAS code, between live migration and the driver's mmap handler. - Disable KCOV instrumentation in the powerpc KASAN code. Thanks to Andrew Donnellan, Benjamin Gray, Christophe Leroy, Haren Myneni, Russell Currey, and Uwe Kleine-König. * tag 'powerpc-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: Revert "powerpc/64s: Remove support for ELFv1 little endian userspace" powerpc/kasan: Disable KCOV in KASAN code powerpc/512x: lpbfifo: Convert to platform remove callback returning void powerpc/crypto: Add gitignore for generated P10 AES/GCM .S files Revert "powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto" powerpc/pseries/vas: Hold mmap_mutex after mmap lock during window close
2023-07-22cifs: update internal module version number for cifs.koSteve French
From 2.43 to 2.44 Signed-off-by: Steve French <stfrench@microsoft.com>
2023-07-22cifs: allow dumping keys for directories tooShyam Prasad N
Dumping the enc/dec keys is a session wide operation. And it should not matter if the ioctl was run on a regular file or a directory. Currently, we obtain the tcon pointer from the cifs file handle. But since there's no dir open call in cifs, this is not populated for dirs. This change allows dumping of session keys using ioctl even for directories. To do this, we'll now get the tcon pointer from the superblock, and not from the file handle. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-07-22Merge tag 's390-6.5-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Heiko Carstens: - Fix per vma lock fault handling: add missing !(fault & VM_FAULT_ERROR) check to fault handler to prevent error handling for return values that don't indicate an error - Use kfree_sensitive() instead of kfree() in paes crypto code to clear memory that may contain keys before freeing it - Fix reply buffer size calculation for CCA replies in zcrypt device driver * tag 's390-6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/zcrypt: fix reply buffer calculations for CCA replies s390/crypto: use kfree_sensitive() instead of kfree() s390/mm: fix per vma lock fault handling
2023-07-22Merge tag 'block-6.5-2023-07-21' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: - Fix for loop regressions (Mauricio) - Fix a potential stall with batched wakeups in sbitmap (David) - Fix for stall with recursive plug flushes (Ross) - Skip accounting of empty requests for blk-iocost (Chengming) - Remove a dead field in struct blk_mq_hw_ctx (Chengming) * tag 'block-6.5-2023-07-21' of git://git.kernel.dk/linux: loop: do not enforce max_loop hard limit by (new) default loop: deprecate autoloading callback loop_probe() sbitmap: fix batching wakeup blk-iocost: skip empty flush bio in iocost blk-mq: delete dead struct blk_mq_hw_ctx->queued field blk-mq: Fix stall due to recursive flush plug
2023-07-22Merge tag 'io_uring-6.5-2023-07-21' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fixes from Jens Axboe: - Fix for io-wq not always honoring REQ_F_NOWAIT, if it was set and punted directly (eg via DRAIN) (me) - Capability check fix (Ondrej) - Regression fix for the mmap changes that went into 6.4, which apparently broke IA64 (Helge) * tag 'io_uring-6.5-2023-07-21' of git://git.kernel.dk/linux: ia64: mmap: Consider pgoff when searching for free mapping io_uring: Fix io_uring mmap() by using architecture-provided get_unmapped_area() io_uring: treat -EAGAIN for REQ_F_NOWAIT as final for io-wq io_uring: don't audit the capability check in io_uring_create()
2023-07-22Merge tag 'devicetree-fixes-for-6.5-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fixes from Rob Herring: - Fix moortec,mr75203 schema usage of 'multipleOf' keyword - Fix regression in systems depending on "of-display" device name - Build fix for s390 with CONFIG_PCI=n and OF_EARLY_FLATTREE=y - Drop two obsolete serial .txt bindings * tag 'devicetree-fixes-for-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: dt-bindings: serial: Remove obsolete nxp,lpc1850-uart.txt dt-bindings: serial: Remove obsolete cavium-uart.txt dt-bindings: hwmon: moortec,mr75203: fix multipleOf for coefficients of: Preserve "of-display" device name for compatibility of: make OF_EARLY_FLATTREE depend on HAS_IOMEM
2023-07-22Merge tag 'regmap-fix-v6.5-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap fixes from Mark Brown: "Three fixes here: - The issues with accounting for register and padding length on raw buses turn out to be quite widespread in custom buses. In order to avoid disturbing anything drop the initial fixes and fall back to a point fix in the SMBus code where the issue was originally noticed, a more substantial refactoring of the API which ensures that all buses make the same assumptions will follow. - The generic regcache code had been forcing on async I/O which did not work with the new maple tree sync code when used with SPI. Since that was mainly for the rbtree cache and the assumptions about hardware that drove the choice are probably not true any more fix this by pushing the enablement of async down into the rbtree code. This probably also makes cache syncs for systems faster though it's not the point. - The test code was triggering use of the rbtree and maple tree caches with dynamic allocation of nodes since all the testing is with RAM backed caches with no I/O performance issues. Just disable the locking in the tests to avoid triggering warnings when allocation debugging is turned on, it's not really what's being tested" * tag 'regmap-fix-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: Disable locking for RBTREE and MAPLE unit tests regcache: Push async I/O request down into the rbtree cache regmap: Account for register length in SMBus I/O limits regmap: Drop initial version of maximum transfer length fixes
2023-07-22Merge tag 'gpio-fixes-for-v6.5-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fixes from Bartosz Golaszewski: - fix initial value handling for output-only pins in gpio-tps68470 - fix two resource leaks in gpio-mvebu * tag 'gpio-fixes-for-v6.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: mvebu: fix irq domain leak gpio: mvebu: Make use of devm_pwmchip_add gpio: tps68470: Make tps68470_gpio_output() always set the initial value
2023-07-21dt-bindings: serial: Remove obsolete nxp,lpc1850-uart.txtRob Herring
nxp,lpc1850-uart.txt binding is already covered by 8250.yaml, so remove it. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230707221607.1064888-1-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
2023-07-21dt-bindings: serial: Remove obsolete cavium-uart.txtRob Herring
cavium-uart.txt binding is already covered by 8250.yaml, so remove it. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230707221602.1063972-1-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
2023-07-21loop: do not enforce max_loop hard limit by (new) defaultMauricio Faria de Oliveira
Problem: The max_loop parameter is used for 2 different purposes: 1) initial number of loop devices to pre-create on init 2) maximum number of loop devices to add on access/open() Historically, its default value (zero) caused 1) to create non-zero number of devices (CONFIG_BLK_DEV_LOOP_MIN_COUNT), and no hard limit on 2) to add devices with autoloading. However, the default value changed in commit 85c50197716c ("loop: Fix the max_loop commandline argument treatment when it is set to 0") to CONFIG_BLK_DEV_LOOP_MIN_COUNT, for max_loop=0 not to pre-create devices. That does improve 1), but unfortunately it breaks 2), as the default behavior changed from no-limit to hard-limit. Example: For example, this userspace code broke for N >= CONFIG, if the user relied on the default value 0 for max_loop: mknod("/dev/loopN"); open("/dev/loopN"); // now fails with ENXIO Though affected users may "fix" it with (loop.)max_loop=0, this means to require a kernel parameter change on stable kernel update (that commit Fixes: an old commit in stable). Solution: The original semantics for the default value in 2) can be applied if the parameter is not set (ie, default behavior). This still keeps the intended function in 1) and 2) if set, and that commit's intended improvement in 1) if max_loop=0. Before 85c50197716c: - default: 1) CONFIG devices 2) no limit - max_loop=0: 1) CONFIG devices 2) no limit - max_loop=X: 1) X devices 2) X limit After 85c50197716c: - default: 1) CONFIG devices 2) CONFIG limit (*) - max_loop=0: 1) 0 devices (*) 2) no limit - max_loop=X: 1) X devices 2) X limit This commit: - default: 1) CONFIG devices 2) no limit (*) - max_loop=0: 1) 0 devices 2) no limit - max_loop=X: 1) X devices 2) X limit Future: The issue/regression from that commit only affects code under the CONFIG_BLOCK_LEGACY_AUTOLOAD deprecation guard, thus the fix too is contained under it. Once that deprecated functionality/code is removed, the purpose 2) of max_loop (hard limit) is no longer in use, so the module parameter description can be changed then. Tests: Linux 6.4-rc7 CONFIG_BLK_DEV_LOOP_MIN_COUNT=8 CONFIG_BLOCK_LEGACY_AUTOLOAD=y - default (original) # ls -1 /dev/loop* /dev/loop-control /dev/loop0 ... /dev/loop7 # ./test-loop open: /dev/loop8: No such device or address - default (patched) # ls -1 /dev/loop* /dev/loop-control /dev/loop0 ... /dev/loop7 # ./test-loop # - max_loop=0 (original & patched): # ls -1 /dev/loop* /dev/loop-control # ./test-loop # - max_loop=8 (original & patched): # ls -1 /dev/loop* /dev/loop-control /dev/loop0 ... /dev/loop7 # ./test-loop open: /dev/loop8: No such device or address - max_loop=0 (patched; CONFIG_BLOCK_LEGACY_AUTOLOAD is not set) # ls -1 /dev/loop* /dev/loop-control # ./test-loop open: /dev/loop8: No such device or address Fixes: 85c50197716c ("loop: Fix the max_loop commandline argument treatment when it is set to 0") Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230720143033.841001-3-mfo@canonical.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-07-21loop: deprecate autoloading callback loop_probe()Mauricio Faria de Oliveira
The 'probe' callback in __register_blkdev() is only used under the CONFIG_BLOCK_LEGACY_AUTOLOAD deprecation guard. The loop_probe() function is only used for that callback, so guard it too, accordingly. See commit fbdee71bb5d8 ("block: deprecate autoloading based on dev_t"). Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230720143033.841001-2-mfo@canonical.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-07-21sbitmap: fix batching wakeupDavid Jeffery
Current code supposes that it is enough to provide forward progress by just waking up one wait queue after one completion batch is done. Unfortunately this way isn't enough, cause waiter can be added to wait queue just after it is woken up. Follows one example(64 depth, wake_batch is 8) 1) all 64 tags are active 2) in each wait queue, there is only one single waiter 3) each time one completion batch(8 completions) wakes up just one waiter in each wait queue, then immediately one new sleeper is added to this wait queue 4) after 64 completions, 8 waiters are wakeup, and there are still 8 waiters in each wait queue 5) after another 8 active tags are completed, only one waiter can be wakeup, and the other 7 can't be waken up anymore. Turns out it isn't easy to fix this problem, so simply wakeup enough waiters for single batch. Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Chengming Zhou <zhouchengming@bytedance.com> Cc: Jan Kara <jack@suse.cz> Signed-off-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20230721095715.232728-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>