linux.git - Linus' kernel tree

Age	Commit message	Author
2022-09-07	dyndbg: gather __dyndbg[] state into struct _ddebug_info	Jim Cromie
	This new struct composes the linker provided (vector,len) section, and provides a place to add other __dyndbg[] state-data later: descs - the vector of descriptors in __dyndbg section. num_descs - length of the data/section. Use it, in several different ways, as follows: In lib/dynamic_debug.c: ddebug_add_module(): Alter params-list, replacing 2 args (array,index) with a struct _ddebug_info * containing them both, with room for expansion. This helps future-proof the function prototype against the looming addition of class-map info into the dyndbg-state, by providing a place to add more member fields later. NB: later add static struct _ddebug_info builtins_state declaration, not needed yet. ddebug_add_module() is called in 2 contexts: In dynamic_debug_init(), declare, init a struct _ddebug_info di auto-var to use as a cursor. Then iterate over the prdbg blocks of the builtin modules, and update the di cursor before calling _add_module for each. Its called from kernel/module/main.c:load_info() for each loaded module: In internal.h, alter struct load_info, replacing the dyndbg array,len fields with an embedded _ddebug_info containing them both; and populate its members in find_module_sections(). The 2 calling contexts differ in that _init deals with contiguous subranges of __dyndbgs[] section, packed together, while loadable modules are added one at a time. So rename ddebug_add_module() into outer/__inner fns, call __inner from _init, and provide the offset into the builtin __dyndbgs[] where the module's prdbgs reside. The cursor provides start, len of the subrange for each. The offset will be used later to pack the results of builtin __dyndbg_sites[] de-duplication, and is 0 and unneeded for loadable modules, Note: kernel/module/main.c includes <dynamic_debug.h> for struct _ddeubg_info. This might be prone to include loops, since its also included by printk.h. Nothing has broken in robot-land on this. 
cc: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20220904214134.408619-12-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
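The refactoring described above can be sketched in userspace C. This is a minimal illustration, not the kernel code: `struct ddebug_desc`, `struct ddebug_info`, and `count_module_sites()` are hypothetical stand-ins, and only the member names `descs` and `num_descs` come from the commit message.

```c
#include <string.h>

/* Hypothetical stand-in for one pr_debug descriptor. */
struct ddebug_desc {
    const char *modname;
    int lineno;
};

/* Bundles the (vector, length) pair; more state can be added later. */
struct ddebug_info {
    struct ddebug_desc *descs;  /* vector of descriptors */
    unsigned int num_descs;     /* number of entries in descs */
};

/*
 * Takes one struct pointer instead of separate (array, count) arguments,
 * so the prototype stays stable when new members are added.
 */
static unsigned int count_module_sites(const struct ddebug_info *di,
                                       const char *modname)
{
    unsigned int i, n = 0;

    for (i = 0; i < di->num_descs; i++)
        if (strcmp(di->descs[i].modname, modname) == 0)
            n++;
    return n;
}
```

A caller can reuse one `struct ddebug_info` as a cursor by pointing `descs` at the start of a module's subrange and setting `num_descs` to its length before each call.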
2022-09-07	dyndbg: cleanup auto vars in dynamic_debug_init	Jim Cromie
	rework var-names for clarity, regularity rename variables - n to mod_sites - it counts sites-per-module - entries to i - display only - iter_start to iter_mod_start - marks start of each module's subrange - modct to mod_ct - stylistic new iterator var: - site - cursor parallel to iter 1st step towards 'demotion' of iter->site, for removal later treat vars as iters: - drop init at top init just above for-loop, in a textual block Acked-by: Jason Baron <jbaron@akamai.com> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20220904214134.408619-11-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-07	net/smc: Fix possible access to freed memory in link clear	Yacan Liu
	After modifying the QP to the Error state, all RX WR would be completed with WC in IB_WC_WR_FLUSH_ERR status. Current implementation does not wait for it is done, but destroy the QP and free the link group directly. So there is a risk that accessing the freed memory in tasklet context. Here is a crash example: BUG: unable to handle page fault for address: ffffffff8f220860 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD f7300e067 P4D f7300e067 PUD f7300f063 PMD 8c4e45063 PTE 800ffff08c9df060 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Tainted: G S OE 5.10.0-0607+ #23 Hardware name: Inspur NF5280M4/YZMB-00689-101, BIOS 4.1.20 07/09/2018 RIP: 0010:native_queued_spin_lock_slowpath+0x176/0x1b0 Code: f3 90 48 8b 32 48 85 f6 74 f6 eb d5 c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 00 c8 02 00 48 03 04 f5 00 09 98 8e <48> 89 10 8b 42 08 85 c0 75 09 f3 90 8b 42 08 85 c0 74 f7 48 8b 32 RSP: 0018:ffffb3b6c001ebd8 EFLAGS: 00010086 RAX: ffffffff8f220860 RBX: 0000000000000246 RCX: 0000000000080000 RDX: ffff91db1f86c800 RSI: 000000000000173c RDI: ffff91db62bace00 RBP: ffff91db62bacc00 R08: 0000000000000000 R09: c00000010000028b R10: 0000000000055198 R11: ffffb3b6c001ea58 R12: ffff91db80e05010 R13: 000000000000000a R14: 0000000000000006 R15: 0000000000000040 FS: 0000000000000000(0000) GS:ffff91db1f840000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffff8f220860 CR3: 00000001f9580004 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> _raw_spin_lock_irqsave+0x30/0x40 mlx5_ib_poll_cq+0x4c/0xc50 [mlx5_ib] smc_wr_rx_tasklet_fn+0x56/0xa0 [smc] tasklet_action_common.isra.21+0x66/0x100 __do_softirq+0xd5/0x29c asm_call_irq_on_stack+0x12/0x20 </IRQ> do_softirq_own_stack+0x37/0x40 irq_exit_rcu+0x9d/0xa0 sysvec_call_function_single+0x34/0x80 
asm_sysvec_call_function_single+0x12/0x20 Fixes: bd4ad57718cc ("smc: initialize IB transport incl. PD, MR, QP, CQ, event, WR") Signed-off-by: Yacan Liu <liuyacan@corp.netease.com> Reviewed-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07	dyndbg: drop EXPORTed dynamic_debug_exec_queries	Jim Cromie
	This exported fn is unused, and will not be needed. Let's dump it. The export was added to let drm control pr_debugs, as part of using them to avoid drm_debug_enabled overheads. But it's better to just implement the drm.debug bitmap interface, so that it's available for everyone. Fixes: a2d375eda771 ("dyndbg: refine export, rename to dynamic_debug_exec_queries()") Fixes: 4c0d77828d4f ("dyndbg: export ddebug_exec_queries") Acked-by: Jason Baron <jbaron@akamai.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20220904214134.408619-10-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-07	dyndbg: add test_dynamic_debug module	Jim Cromie
	Provide a simple module to allow testing DYNAMIC_DEBUG behavior. It calls do_prints() from module-init, and with a sysfs-node.
	  dmesg -C
	  dmesg -w &
	  modprobe test_dynamic_debug dyndbg=+p
	  echo 1 > /sys/module/dynamic_debug/parameters/verbose
	  cat /sys/module/test_dynamic_debug/parameters/do_prints
	  echo module test_dynamic_debug +mftl > /proc/dynamic_debug/control
	  echo junk > /sys/module/test_dynamic_debug/parameters/do_prints
	Acked-by: Jason Baron <jbaron@akamai.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20220904214134.408619-9-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-07	dyndbg: let query-modname override actual module name	Jim Cromie
	dyndbg's control-parser, ddebug_parse_query(), requires that the search terms (module, func, file, lineno) are each used only once in a query; a thing cannot be named both foo and bar.
	The cited commit added an overriding module modname, taken from the module loader, which is authoritative. So it set query.module 1st, which disallowed its use in the query-string.
	But now, it's useful to allow a module-load to enable classes across a whole (or part of) a subsystem at once.
	  # enable (dynamic-debug in) drm only
	  modprobe drm dyndbg="class DRM_UT_CORE +p"
	  # get drm_helper too
	  modprobe drm dyndbg="class DRM_UT_CORE module drm* +p"
	  # get everything that knows DRM_UT_CORE
	  modprobe drm dyndbg="class DRM_UT_CORE module * +p"
	  # also for boot-args:
	  drm.dyndbg="class DRM_UT_CORE module * +p"
	So convert the override into a default, by filling it only when/after the query-string omitted the module.
	NB: the query class FOO handling is forthcoming.
	Fixes: 8e59b5cfb9a6 dynamic_debug: add modname arg to exec_query callchain Acked-by: Jason Baron <jbaron@akamai.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20220904214134.408619-8-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
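The "override becomes a default" change can be illustrated with a small userspace sketch. `struct query` and `apply_default_module()` are hypothetical names chosen for this example; the sketch only assumes that the loader-supplied module name is filled in when the query string did not name a module.

```c
#include <stddef.h>

/* Hypothetical, reduced query: only the fields needed for the example. */
struct query {
    const char *module;      /* NULL when the query string omitted it */
    const char *class_name;
};

/*
 * Use the loader-supplied modname as a default, not an override:
 * it is applied only when the parsed query left the module unset.
 */
static void apply_default_module(struct query *q, const char *modname)
{
    if (modname && !q->module)
        q->module = modname;
}
```

With this ordering, a query such as `class DRM_UT_CORE module drm* +p` keeps its own `module drm*` term, while `class DRM_UT_CORE +p` falls back to the module being loaded.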
2022-09-07	dyndbg: use ESCAPE_SPACE for cat control	Jim Cromie
	`cat control` currently does octal escape, so '\n' becomes "\012". Change this to display as "\n" instead, which reads much cleaner. :#> head -n7 /proc/dynamic_debug/control # filename:lineno [module]function flags format init/main.c:1179 [main]initcall_blacklist =_ "blacklisting initcall %s\n" init/main.c:1218 [main]initcall_blacklisted =_ "initcall %s blacklisted\n" init/main.c:1424 [main]run_init_process =_ " with arguments:\n" init/main.c:1426 [main]run_init_process =_ " %s\n" init/main.c:1427 [main]run_init_process =_ " with environment:\n" init/main.c:1429 [main]run_init_process =_ " %s\n" Acked-by: Jason Baron <jbaron@akamai.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20220904214134.408619-7-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
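The difference between the two escaping styles can be shown with a toy escaper. This is a sketch only: `escape_fmt()` is a hypothetical helper written for this example and is not the kernel's string-escaping routine; it handles just `\n`, `\t` and `\r` symbolically.

```c
#include <stdio.h>

/*
 * Toy escaper (hypothetical, not the kernel implementation).
 * space_style != 0: write \n, \t, \r symbolically.
 * space_style == 0: write every control character as 3-digit octal.
 * dst must hold len bytes; the output is always NUL-terminated.
 */
static void escape_fmt(const char *src, char *dst, size_t len, int space_style)
{
    size_t n = 0;

    if (len == 0)
        return;
    /* Reserve 5 bytes per step: up to 4 output chars plus the NUL. */
    for (; *src && n + 5 <= len; src++) {
        unsigned char c = (unsigned char)*src;
        const char *sym = NULL;

        if (space_style) {
            if (c == '\n')
                sym = "\\n";
            else if (c == '\t')
                sym = "\\t";
            else if (c == '\r')
                sym = "\\r";
        }
        if (sym)
            n += (size_t)snprintf(dst + n, len - n, "%s", sym);
        else if (c < 0x20 || c == 0x7f)
            n += (size_t)snprintf(dst + n, len - n, "\\%03o", (unsigned int)c);
        else
            dst[n++] = (char)c;
    }
    dst[n] = '\0';
}
```

Escaping the format string `" %s\n"` with `space_style` set to 0 gives the octal form seen before this commit, and with `space_style` set to 1 gives the symbolic form shown in the listing above.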
2022-09-07	dyndbg: reverse module.callsite walk in cat control	Jim Cromie
	Walk the module's vector of callsites backwards; ie N..0. This "corrects" the backwards appearance of a module's prdbg vector when walked 0..N. I think this is due to linker mechanics, which I'm inclined to treat as immutable, and the order is fixable in display. No functional changes. Combined with previous commit, which reversed tables-list, we get: :#> head -n7 /proc/dynamic_debug/control # filename:lineno [module]function flags format init/main.c:1179 [main]initcall_blacklist =_ "blacklisting initcall %s\012" init/main.c:1218 [main]initcall_blacklisted =_ "initcall %s blacklisted\012" init/main.c:1424 [main]run_init_process =_ " with arguments:\012" init/main.c:1426 [main]run_init_process =_ " %s\012" init/main.c:1427 [main]run_init_process =_ " with environment:\012" init/main.c:1429 [main]run_init_process =_ " %s\012" Acked-by: Jason Baron <jbaron@akamai.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20220904214134.408619-6-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-07	dyndbg: reverse module walk in cat control	Jim Cromie
	/proc/dynamic_debug/control walks the prdbg catalog in "reverse"; fix this by adding new ddebug_tables to the tail of the list. This puts init/main.c entries 1st, which looks more than coincidental. No functional changes. Acked-by: Jason Baron <jbaron@akamai.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20220904214134.408619-5-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-07	dyndbg: show both old and new in change-info	Jim Cromie
	Print "old => new" flag values in the info("change") message. No functional change. Acked-by: Jason Baron <jbaron@akamai.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20220904214134.408619-4-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-07	dyndbg: fix module.dyndbg handling	Jim Cromie
	For CONFIG_DYNAMIC_DEBUG=N, the ddebug_dyndbg_module_param_cb() stub-fn is too permissive: bash-5.1# modprobe drm JUNKdyndbg bash-5.1# modprobe drm dyndbgJUNK [ 42.933220] dyndbg param is supported only in CONFIG_DYNAMIC_DEBUG builds [ 42.937484] ACPI: bus type drm_connector registered This caused no ill effects, because unknown parameters are either ignored by default with an "unknown parameter" warning, or ignored because dyndbg allows its no-effect use on non-dyndbg builds. But since the code has an explicit feedback message, it should be issued accurately. Fix with strcmp for exact param-name match. Fixes: b48420c1d301 dynamic_debug: make dynamic-debug work for module initialization Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Acked-by: Jason Baron <jbaron@akamai.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20220904214134.408619-3-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
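The effect of an exact-name match can be sketched as follows. The commit message only says the stub was "too permissive"; this sketch assumes the permissive check behaved like a substring match, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Assumed permissive behaviour: any parameter containing "dyndbg" matches. */
static bool is_dyndbg_param_loose(const char *param)
{
    return strstr(param, "dyndbg") != NULL;
}

/* Exact param-name match, as the fix describes (strcmp). */
static bool is_dyndbg_param_exact(const char *param)
{
    return strcmp(param, "dyndbg") == 0;
}
```

Under that assumption, `JUNKdyndbg` and `dyndbgJUNK` pass the loose check and would trigger the feedback message, while only the literal `dyndbg` passes the exact one.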
2022-09-07	dyndbg: fix static_branch manipulation	Jim Cromie
	In https://lore.kernel.org/lkml/20211209150910.GA23668@axis.com/ Vincent's patch commented on, and worked around, a bug toggling static_branches, when a 2nd PRINTK-ish flag was added. The bug results in a premature static_branch_disable when the 1st of 2 flags was disabled.
	The cited commit computed newflags, but then in the JUMP_LABEL block, failed to use that result, instead using just one of the terms in it. Using newflags instead made the code work properly.
	This is Vincent's test-case, reduced. It needs the 2nd flag to demonstrate the bug, but it's explanatory here.
	  pt_test() {
	      echo 5 > /sys/module/dynamic_debug/verbose
	      site="module tcp" # just one callsite
	      echo " $site =_ " > /proc/dynamic_debug/control # clear it
	      # A B ~A ~B
	      for flg in +T +p "-T #broke here" -p; do
	          echo " $site $flg " > /proc/dynamic_debug/control
	      done;
	      # A B ~B ~A
	      for flg in +T +p "-p #broke here" -T; do
	          echo " $site $flg " > /proc/dynamic_debug/control
	      done
	  }
	  pt_test
	Fixes: 84da83a6ffc0 dyndbg: combine flags & mask into a struct, simplify with it CC: vincent.whitchurch@axis.com Acked-by: Jason Baron <jbaron@akamai.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Link: https://lore.kernel.org/r/20220904214134.408619-2-jim.cromie@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
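The bug pattern can be modelled in userspace. This is a simplified sketch, not the kernel code: the flag bits, `struct modifiers`, and both `branch_*` functions are hypothetical, and the static branch is modelled as a plain boolean. The buggy variant decides from the modifier term alone; the fixed variant decides from the combined new flags.

```c
#include <stdbool.h>

#define FLAG_PRINT    (1u << 0)  /* hypothetical bit for 'p' */
#define FLAG_TRACE    (1u << 1)  /* hypothetical bit for 'T' */
#define FLAGS_ENABLED (FLAG_PRINT | FLAG_TRACE)

/* "+X" sets flags=X, mask=~0; "-X" sets flags=0, mask=~X. */
struct modifiers {
    unsigned int flags;
    unsigned int mask;
};

static unsigned int new_flags(unsigned int old, struct modifiers m)
{
    return (old & m.mask) | m.flags;
}

/* Buggy: looks only at m.flags, one term of the new value. */
static bool branch_buggy(unsigned int old, struct modifiers m, bool branch)
{
    if (old & FLAGS_ENABLED) {
        if (!(m.flags & FLAGS_ENABLED))
            branch = false;   /* premature disable */
    } else if (m.flags & FLAGS_ENABLED) {
        branch = true;
    }
    return branch;
}

/* Fixed: decides from the full result of new_flags(). */
static bool branch_fixed(unsigned int old, struct modifiers m, bool branch)
{
    unsigned int nf = new_flags(old, m);

    if (old & FLAGS_ENABLED) {
        if (!(nf & FLAGS_ENABLED))
            branch = false;
    } else if (nf & FLAGS_ENABLED) {
        branch = true;
    }
    return branch;
}
```

Replaying the `+T +p -T` sequence from the test case: with both flags set, applying `-T` makes the buggy variant switch the branch off even though `p` is still set, while the fixed variant keeps it on until the last flag is cleared.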
2022-09-07	usb: dwc3: core: leave default DMA if the controller does not support 64-bit DMA	William Wu
	On some DWC3 controllers (e.g. Rockchip SoCs), the DWC3 core doesn't support 64-bit DMA address width. In this case, this driver should use the default 32-bit mask. Otherwise, the DWC3 controller will break if it runs in an environment with more than 4GB of physical memory. This patch reads the DWC_USB3_AWIDTH bits of GHWPARAMS0, which are used for the DMA address width, and only configures the 64-bit DMA mask if DWC_USB3_AWIDTH is 64. Fixes: 45d39448b4d0 ("usb: dwc3: support 64 bit DMA in platform driver") Cc: stable <stable@kernel.org> Reviewed-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: William Wu <william.wu@rock-chips.com> Link: https://lore.kernel.org/r/20220901083446.3799754-1-william.wu@rock-chips.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-07	net: ethernet: mtk_eth_soc: check max allowed hash in mtk_ppe_check_skb	Lorenzo Bianconi
	Even if the max hash configured in hw in mtk_ppe_hash_entry is MTK_PPE_ENTRIES - 1, check for theoretical OOB accesses in the mtk_ppe_check_skb routine. Fixes: c4f033d9e03e9 ("net: ethernet: mtk_eth_soc: rework hardware flow table management") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07	net: skb: export skb drop reaons to user by TRACE_DEFINE_ENUM	Menglong Dong
	As Eric reported, the 'reason' field is not presented when trace the kfree_skb event by perf: $ perf record -e skb:kfree_skb -a sleep 10 $ perf script ip_defrag 14605 [021] 221.614303: skb:kfree_skb: skbaddr=0xffff9d2851242700 protocol=34525 location=0xffffffffa39346b1 reason: The cause seems to be passing kernel address directly to TP_printk(), which is not right. As the enum 'skb_drop_reason' is not exported to user space through TRACE_DEFINE_ENUM(), perf can't get the drop reason string from the 'reason' field, which is a number. Therefore, we introduce the macro DEFINE_DROP_REASON(), which is used to define the trace enum by TRACE_DEFINE_ENUM(). With the help of DEFINE_DROP_REASON(), now we can remove the auto-generate that we introduced in the commit ec43908dd556 ("net: skb: use auto-generation to convert skb drop reason to string"), and define the string array 'drop_reasons'. Hmmmm...now we come back to the situation that have to maintain drop reasons in both enum skb_drop_reason and DEFINE_DROP_REASON. But they are both in dropreason.h, which makes it easier. After this commit, now the format of kfree_skb is like this: $ cat /tracing/events/skb/kfree_skb/format name: kfree_skb ID: 1524 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:void * skbaddr; offset:8; size:8; signed:0; field:void * location; offset:16; size:8; signed:0; field:unsigned short protocol; offset:24; size:2; signed:0; field:enum skb_drop_reason reason; offset:28; size:4; signed:0; print fmt: "skbaddr=%p protocol=%u location=%p reason: %s", REC->skbaddr, REC->protocol, REC->location, __print_symbolic(REC->reason, { 1, "NOT_SPECIFIED" }, { 2, "NO_SOCKET" } ...... 
Fixes: ec43908dd556 ("net: skb: use auto-generation to convert skb drop reason to string") Link: https://lore.kernel.org/netdev/CANn89i+bx0ybvE55iMYf5GJM48WwV1HNpdm9Q6t-HaEstqpCSA@mail.gmail.com/ Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Menglong Dong <imagedong@tencent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07	net: ethernet: mtk_eth_soc: fix typo in __mtk_foe_entry_clear	Lorenzo Bianconi
	Set ib1 state to MTK_FOE_STATE_UNBIND in __mtk_foe_entry_clear routine. Fixes: 33fc42de33278 ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07	netfilter: nfnetlink_osf: fix possible bogus match in nf_osf_find()	Pablo Neira Ayuso
	nf_osf_find() incorrectly returns true on mismatch. This leads to copying an uninitialized memory area in nft_osf, which can be used to leak stale kernel stack data to userspace. Fixes: 22c7652cdaa8 ("netfilter: nft_osf: Add version option support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-07	netfilter: nf_conntrack_irc: Tighten matching on DCC message	David Leadbeater
	CTCP messages should only be at the start of an IRC message, not anywhere within it. While the helper only decodes packets in the ORIGINAL direction, it's possible to make a client send a CTCP message back by embedding one into a PING request. As-is, that's enough to make the helper believe that it saw a CTCP message. Fixes: 869f37d8e48f ("[NETFILTER]: nf_conntrack/nf_nat: add IRC helper port") Signed-off-by: David Leadbeater <dgl@dgl.cx> Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-07	iommu/virtio: Fix interaction with VFIO	Jean-Philippe Brucker
	Commit e8ae0e140c05 ("vfio: Require that devices support DMA cache coherence") requires IOMMU drivers to advertise IOMMU_CAP_CACHE_COHERENCY, in order to be used by VFIO. Since VFIO does not provide to userspace the ability to maintain coherency through cache invalidations, it requires hardware coherency. Advertise the capability in order to restore VFIO support. The meaning of IOMMU_CAP_CACHE_COHERENCY also changed from "IOMMU can enforce cache coherent DMA transactions" to "IOMMU_CACHE is supported". While virtio-iommu cannot enforce coherency (of PCIe no-snoop transactions), it does support IOMMU_CACHE. We can distinguish different cases of non-coherent DMA: (1) When accesses from a hardware endpoint are not coherent. The host would describe such a device using firmware methods ('dma-coherent' in device-tree, '_CCA' in ACPI), since they are also needed without a vIOMMU. In this case mappings are created without IOMMU_CACHE. virtio-iommu doesn't need any additional support. It sends the same requests as for coherent devices. (2) When the physical IOMMU supports non-cacheable mappings. Supporting those would require a new feature in virtio-iommu, new PROBE request property and MAP flags. Device drivers would use a new API to discover this since it depends on the architecture and the physical IOMMU. (3) When the hardware supports PCIe no-snoop. It is possible for assigned PCIe devices to issue no-snoop transactions, and the virtio-iommu specification is lacking any mention of this. Arm platforms don't necessarily support no-snoop, and those that do cannot enforce coherency of no-snoop transactions. Device drivers must be careful about assuming that no-snoop transactions won't end up cached; see commit e02f5c1bb228 ("drm: disable uncached DMA optimization for ARM and arm64"). On x86 platforms, the host may or may not enforce coherency of no-snoop transactions with the physical IOMMU. 
But according to the above commit, on x86 a driver which assumes that no-snoop DMA is compatible with uncached CPU mappings will also work if the host enforces coherency. Although these issues are not specific to virtio-iommu, it could be used to facilitate discovery and configuration of no-snoop. This would require a new feature bit, PROBE property and ATTACH/MAP flags. Cc: stable@vger.kernel.org Fixes: e8ae0e140c05 ("vfio: Require that devices support DMA cache coherence") Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20220825154622.86759-1-jean-philippe@linaro.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	iommu/vt-d: Fix lockdep splat due to klist iteration in atomic context	Lu Baolu
	With CONFIG_INTEL_IOMMU_DEBUGFS enabled, below lockdep splat are seen when an I/O fault occurs on a machine with an Intel IOMMU in it. DMAR: DRHD: handling fault status reg 3 DMAR: [DMA Write NO_PASID] Request device [00:1a.0] fault addr 0x0 [fault reason 0x05] PTE Write access is not set DMAR: Dump dmar0 table entries for IOVA 0x0 DMAR: root entry: 0x0000000127f42001 DMAR: context entry: hi 0x0000000000001502, low 0x000000012d8ab001 ================================ WARNING: inconsistent lock state 5.20.0-0.rc0.20220812git7ebfc85e2cd7.10.fc38.x86_64 #1 Not tainted -------------------------------- inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage. rngd/1006 [HC1[1]:SC0[0]:HE0:SE1] takes: ff177021416f2d78 (&k->k_lock){?.+.}-{2:2}, at: klist_next+0x1b/0x160 {HARDIRQ-ON-W} state was registered at: lock_acquire+0xce/0x2d0 _raw_spin_lock+0x33/0x80 klist_add_tail+0x46/0x80 bus_add_device+0xee/0x150 device_add+0x39d/0x9a0 add_memory_block+0x108/0x1d0 memory_dev_init+0xe1/0x117 driver_init+0x43/0x4d kernel_init_freeable+0x1c2/0x2cc kernel_init+0x16/0x140 ret_from_fork+0x1f/0x30 irq event stamp: 7812 hardirqs last enabled at (7811): [<ffffffff85000e86>] asm_sysvec_apic_timer_interrupt+0x16/0x20 hardirqs last disabled at (7812): [<ffffffff84f16894>] irqentry_enter+0x54/0x60 softirqs last enabled at (7794): [<ffffffff840ff669>] __irq_exit_rcu+0xf9/0x170 softirqs last disabled at (7787): [<ffffffff840ff669>] __irq_exit_rcu+0xf9/0x170 The klist iterator functions using spin_lock_irq() but the klist insertion functions using spin_*lock(), combined with the Intel DMAR IOMMU driver iterating over klists from atomic (hardirq) context, where pci_get_domain_bus_and_slot() calls into bus_find_device() which iterates over klists. As currently there's no plan to fix the klist to make it safe to use in atomic context, this fixes the lockdep splat by avoid calling pci_get_domain_bus_and_slot() in the hardirq context. 
Fixes: 8ac0b64b9735 ("iommu/vt-d: Use pci_get_domain_bus_and_slot() in pgtable_walk()") Reported-by: Lennert Buytenhek <buytenh@wantstofly.org> Link: https://lore.kernel.org/linux-iommu/Yvo2dfpEh%2FWC+Wrr@wantstofly.org/ Link: https://lore.kernel.org/linux-iommu/YvyBdPwrTuHHbn5X@wantstofly.org/ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220819015949.4795-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	iommu/vt-d: Fix recursive lock issue in iommu_flush_dev_iotlb()	Lu Baolu
	The per domain spinlock is acquired in iommu_flush_dev_iotlb(), which may be called in interrupt context. For example, the drm-intel's CI system got completely blocked with the below error: WARNING: inconsistent lock state 6.0.0-rc1-CI_DRM_11990-g6590d43d39b9+ #1 Not tainted -------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. swapper/6/0 [HC0[0]:SC1[1]:HE1:SE0] takes: ffff88810440d678 (&domain->lock){+.?.}-{2:2}, at: iommu_flush_dev_iotlb.part.61+0x23/0x80 {SOFTIRQ-ON-W} state was registered at: lock_acquire+0xd3/0x310 _raw_spin_lock+0x2a/0x40 domain_update_iommu_cap+0x20b/0x2c0 intel_iommu_attach_device+0x5bd/0x860 __iommu_attach_device+0x18/0xe0 bus_iommu_probe+0x1f3/0x2d0 bus_set_iommu+0x82/0xd0 intel_iommu_init+0xe45/0x102a pci_iommu_init+0x9/0x31 do_one_initcall+0x53/0x2f0 kernel_init_freeable+0x18f/0x1e1 kernel_init+0x11/0x120 ret_from_fork+0x1f/0x30 irq event stamp: 162354 hardirqs last enabled at (162354): [<ffffffff81b59274>] _raw_spin_unlock_irqrestore+0x54/0x70 hardirqs last disabled at (162353): [<ffffffff81b5901b>] _raw_spin_lock_irqsave+0x4b/0x50 softirqs last enabled at (162338): [<ffffffff81e00323>] __do_softirq+0x323/0x48e softirqs last disabled at (162349): [<ffffffff810c1588>] irq_exit_rcu+0xb8/0xe0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&domain->lock); <Interrupt> lock(&domain->lock); * DEADLOCK * 1 lock held by swapper/6/0: This converts the spin_lock/unlock() into the irq save/restore varieties to fix the recursive locking issues. Fixes: ffd5869d93530 ("iommu/vt-d: Replace spin_lock_irqsave() with spin_lock()") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20220817025650.3253959-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	iommu/vt-d: Correctly calculate sagaw value of IOMMU	Lu Baolu
	The Intel IOMMU driver possibly selects between the first-level and the second-level translation tables for DMA address translation. However, the levels of page-table walks for the 4KB base page size are calculated from the SAGAW field of the capability register, which is only valid for the second-level page table. This causes the IOMMU driver to stop working if the hardware (or the emulated IOMMU) advertises only first-level translation capability and reports the SAGAW field as 0. This solves the above problem by considering both the first level and the second level when calculating the supported page table levels. Fixes: b802d070a52a1 ("iommu/vt-d: Use iova over first level") Cc: stable@vger.kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220817023558.3253263-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-07	iommu/vt-d: Fix kdump kernels boot failure with scalable mode	Lu Baolu
	The translation table copying code for kdump kernels is currently based on the extended root/context entry formats of ECS mode defined in older VT-d v2.5, and doesn't handle the scalable mode formats. This causes the kexec capture kernel boot failure with DMAR faults if the IOMMU was enabled in scalable mode by the previous kernel. The ECS mode has already been deprecated by the VT-d spec since v3.0 and Intel IOMMU driver doesn't support this mode as there's no real hardware implementation. Hence this converts ECS checking in copying table code into scalable mode. The existing copying code consumes a bit in the context entry as a mark of copied entry. It needs to work for the old format as well as for the extended context entries. As it's hard to find such a common bit for both legacy and scalable mode context entries. This replaces it with a per- IOMMU bitmap. Fixes: 7373a8cc38197 ("iommu/vt-d: Setup context and enable RID2PASID support") Cc: stable@vger.kernel.org Reported-by: Jerry Snitselaar <jsnitsel@redhat.com> Tested-by: Wen Jin <wen.jin@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220817011035.3250131-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
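The idea of tracking copied context entries in a per-IOMMU bitmap, instead of a bit inside each entry, can be sketched like this. Everything here is hypothetical: `struct toy_iommu`, the helper names, and the choice to index the bitmap by `(bus << 8) | devfn` are assumptions made for illustration, not details taken from the commit message.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define CTX_ENTRIES (256u * 256u)  /* assumed: 256 buses x 256 devfn values */
#define WORD_BITS   (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical per-IOMMU state: one "copied" bit per context entry. */
struct toy_iommu {
    unsigned long copied[CTX_ENTRIES / WORD_BITS];
};

/* Assumed index layout: bus number in the high byte, devfn in the low byte. */
static unsigned int ctx_index(uint8_t bus, uint8_t devfn)
{
    return ((unsigned int)bus << 8) | devfn;
}

static void set_copied(struct toy_iommu *iommu, uint8_t bus, uint8_t devfn)
{
    unsigned int idx = ctx_index(bus, devfn);

    iommu->copied[idx / WORD_BITS] |= 1ul << (idx % WORD_BITS);
}

static void clear_copied(struct toy_iommu *iommu, uint8_t bus, uint8_t devfn)
{
    unsigned int idx = ctx_index(bus, devfn);

    iommu->copied[idx / WORD_BITS] &= ~(1ul << (idx % WORD_BITS));
}

static bool is_copied(const struct toy_iommu *iommu, uint8_t bus, uint8_t devfn)
{
    unsigned int idx = ctx_index(bus, devfn);

    return (iommu->copied[idx / WORD_BITS] >> (idx % WORD_BITS)) & 1ul;
}
```

Because the mark lives outside the context entry, the same bookkeeping works regardless of whether the entry uses the legacy or the scalable-mode layout.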
2022-09-07	MIPS: OCTEON: irq: Fix octeon_irq_force_ciu_mapping()	Alexander Sverdlin
	For irq_domain_associate() to work the virq descriptor has to be pre-allocated in advance. Otherwise the following happens: WARNING: CPU: 0 PID: 0 at .../kernel/irq/irqdomain.c:527 irq_domain_associate+0x298/0x2e8 error: virq128 is not allocated Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.78-... #1 ... Call Trace: [<ffffffff801344c4>] show_stack+0x9c/0x130 [<ffffffff80769550>] dump_stack+0x90/0xd0 [<ffffffff801576d0>] __warn+0x118/0x130 [<ffffffff80157734>] warn_slowpath_fmt+0x4c/0x70 [<ffffffff801b83c0>] irq_domain_associate+0x298/0x2e8 [<ffffffff80a43bb8>] octeon_irq_init_ciu+0x4c8/0x53c [<ffffffff80a76cbc>] of_irq_init+0x1e0/0x388 [<ffffffff80a452cc>] init_IRQ+0x4c/0xf4 [<ffffffff80a3cc00>] start_kernel+0x404/0x698 Use irq_alloc_desc_at() to avoid the above problem. Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-09-07	selftests: nft_concat_range: add socat support	Florian Westphal
	There are different flavors of 'nc' around; this script fails on my test vm because 'nc' is 'nmap-ncat', which isn't 100% compatible. Add socat support and use it if available. Signed-off-by: Florian Westphal <fw@strlen.de>
2022-09-07	netfilter: nf_conntrack_sip: fix ct_sip_walk_headers	Igor Ryzhov
	ct_sip_next_header and ct_sip_get_header return an absolute value of matchoff, not a shift from current dataoff. So dataoff should be assigned matchoff, not incremented by it. This issue can be seen in the scenario when there are multiple Contact headers and the first one is using a hostname and other headers use IP addresses. In this case, ct_sip_walk_headers will work as follows: The first ct_sip_get_header call will find the first Contact header but will return -1 as the header uses a hostname. But matchoff will be changed to the offset of this header. After that, dataoff should be set to matchoff, so that the next ct_sip_get_header call finds the next Contact header. But instead of assigning matchoff to dataoff, dataoff is incremented by it, which is not correct, as matchoff is an absolute value of the offset. So on the next call to ct_sip_get_header, dataoff will be incorrect, and the next Contact header may not be found at all. Fixes: 05e3ced297fe ("[NETFILTER]: nf_conntrack_sip: introduce SIP-URI parsing helper") Signed-off-by: Igor Ryzhov <iryzhov@nfware.com> Signed-off-by: Florian Westphal <fw@strlen.de>
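The offset bug can be reproduced with a toy parser. This sketch is not the conntrack code: `get_header()` and `walk_headers()` are hypothetical stand-ins that mimic only the behaviour described above, namely that the reported match offset is absolute and that a hostname value yields -1.

```c
#include <ctype.h>
#include <string.h>

#define HDR "Contact:"

/*
 * Toy header finder. Searches from dataoff, stores the ABSOLUTE offset
 * just past the header name in *matchoff, and returns 1 for an IP-like
 * value, -1 for a hostname, 0 when no further header is found.
 */
static int get_header(const char *buf, size_t dataoff, size_t *matchoff)
{
    const char *p;

    if (dataoff > strlen(buf))
        return 0;
    p = strstr(buf + dataoff, HDR);
    if (!p)
        return 0;
    p += strlen(HDR);
    *matchoff = (size_t)(p - buf);
    while (*p == ' ')
        p++;
    return isdigit((unsigned char)*p) ? 1 : -1;
}

/* Returns the offset of the first usable header value, or -1 if none. */
static long walk_headers(const char *buf, size_t dataoff, int buggy)
{
    size_t matchoff = 0;

    for (;;) {
        int ret = get_header(buf, dataoff, &matchoff);

        if (ret == 0)
            return -1;
        if (ret > 0)
            return (long)matchoff;
        if (buggy)
            dataoff += matchoff;  /* treats an absolute offset as relative */
        else
            dataoff = matchoff;   /* continue right after the failed header */
    }
}
```

For a message with a hostname Contact followed by an IP Contact, and a non-zero starting offset, the buggy walker jumps past the start of the second header and misses it, while the fixed walker finds it.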
2022-09-07	MIPS: octeon: Get rid of preprocessor directives around RESERVE32	Alexander Sverdlin
	Some of them were pointless because CONFIG_CAVIUM_RESERVE32 is now always defined, some were not enough (Yu Zhao reported "Failed to allocate CAVIUM_RESERVE32 memory area" error). Removing the directives allows for compiler coverage of RESERVE32 code and replacing one of [always-true] "ifdef" with a compiler conditional fixes the [cosmetic] error message. Fixes: 3e3114ac460e ("MIPS: Introduce CAVIUM_RESERVE32 Kconfig option") Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-09-07	Merge branch 'dsa-felix-fixes'	David S. Miller
	Vladimir Oltean says: ==================== Fixes for Felix DSA driver calculation of tc-taprio guard bands This series fixes some bugs which are not quite new, but date from v5.13 when static guard bands were enabled by Michael Walle to prevent tc-taprio overruns. The investigation started when Xiaoliang asked privately what is the expected max SDU for a traffic class when its minimum gate interval is 10 us. The answer, as it turns out, is not an L1 size of 1250 octets, but 1245 octets, since otherwise, the switch will not consider frames for egress scheduling, because the static guard band is exactly as large as the time interval. The switch needs a minimum of 33 ns outside of the guard band to consider a frame for scheduling, and the reduction of the max SDU by 5 provides exactly for that. The fix for that (patch 1/3) is relatively small, but during testing, it became apparent that cut-through forwarding prevents oversized frame dropping from working properly. This is solved through the larger patch 2/3. Finally, patch 3/3 fixes one more tc-taprio locking problem found through code inspection. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
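The arithmetic in the cover letter can be checked with a few lines of C. This is a worked example rather than the driver's formula: `max_sdu_octets()` is a hypothetical helper, and the 1000 Mbps link speed is an assumption inferred from the figures given (10 us corresponding to 1250 octets means 8 ns per octet).

```c
#include <stdint.h>

/*
 * Largest frame, in L1 octets, that fits in a gate interval while leaving
 * margin_ns outside of the guard band. The margin is rounded up to whole
 * octets, the interval is rounded down.
 */
static uint64_t max_sdu_octets(uint64_t interval_ns, uint64_t margin_ns,
                               uint64_t speed_mbps)
{
    uint64_t ps_per_octet = 8ull * 1000000ull / speed_mbps;
    uint64_t interval_octets = interval_ns * 1000ull / ps_per_octet;
    uint64_t margin_octets =
        (margin_ns * 1000ull + ps_per_octet - 1) / ps_per_octet;

    return interval_octets > margin_octets ?
           interval_octets - margin_octets : 0;
}
```

At the assumed speed, a 10 us interval holds 1250 octets, and a 33 ns margin rounds up to 5 octets (33 / 8 = 4.125), which gives the 1245 octets quoted in the text.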
2022-09-07	net: dsa: felix: access QSYS_TAG_CONFIG under tas_lock in vsc9959_sched_speed_set	Vladimir Oltean
	The read-modify-write of QSYS_TAG_CONFIG from vsc9959_sched_speed_set() runs unlocked with respect to the other functions that access it, which are vsc9959_tas_guard_bands_update(), vsc9959_qos_port_tas_set() and vsc9959_tas_clock_adjust(). All the others are under ocelot->tas_lock, so move the vsc9959_sched_speed_set() access under that lock as well, to resolve the concurrency. Fixes: 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07	net: dsa: felix: disable cut-through forwarding for frames oversized for tc-taprio	Vladimir Oltean
	Experimentally, it looks like when QSYS_QMAXSDU_CFG_7 is set to 605, frames even way larger than 601 octets are transmitted even though these should be considered as oversized, according to the documentation, and dropped. Since oversized frame dropping depends on frame size, which is only known at the EOF stage, and therefore not at SOF when cut-through forwarding begins, it means that the switch cannot take QSYS_QMAXSDU_CFG_* into consideration for traffic classes that are cut-through. Since cut-through forwarding has no UAPI to control it, and the driver enables it based on the mantra "if we can, then why not", the strategy is to alter vsc9959_cut_through_fwd() to take into consideration which tc's have oversize frame dropping enabled, and disable cut-through for them. Then, from vsc9959_tas_guard_bands_update(), we re-trigger the cut-through determination process. There are 2 strategies for vsc9959_cut_through_fwd() to determine whether a tc has oversized dropping enabled or not. One is to keep a bit mask of traffic classes per port, and the other is to read back from the hardware registers (a non-zero value of QSYS_QMAXSDU_CFG_* means the feature is enabled). We choose reading back from registers, because struct ocelot_port is shared with drivers (ocelot, seville) that support neither cut-through nor tc-taprio, and we don't have a felix specific extension of struct ocelot_port. Furthermore, reading registers from the Felix hardware is quite cheap, since they are memory-mapped. Fixes: 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
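The chosen strategy (a non-zero per-tc max SDU value means oversized dropping is on, so cut-through is turned off for that tc) can be sketched as a simple mask computation. The helper name, the array standing in for the register read-back, and the 8-entry layout are assumptions made for illustration; this is not the driver's vsc9959_cut_through_fwd().

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NUM_TC 8	/* assumed number of traffic classes */

/*
 * Hypothetical helper: maxsdu[] stands in for the values read back from
 * the per-tc max SDU registers (0 means oversized dropping is disabled).
 * Returns candidate_mask with every tc that has dropping enabled cleared.
 */
static uint8_t toy_cut_through_mask(const uint32_t maxsdu[TOY_NUM_TC],
				    uint8_t candidate_mask)
{
	uint8_t mask = candidate_mask;
	int tc;

	assert(maxsdu);
	for (tc = 0; tc < TOY_NUM_TC; tc++) {
		if (maxsdu[tc] != 0)
			mask &= (uint8_t)~(1u << tc);
	}
	return mask;
}
```

For example, if only tc 7 has a non-zero max SDU (such as the 605 mentioned above), a candidate mask of 0xff is reduced to 0x7f.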
2022-09-07	net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet	Vladimir Oltean
	The blamed commit broke tc-taprio schedules such as this one: tc qdisc replace dev $swp1 root taprio num_tc 8 map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 10000 flags 0x2, because the gate entry for TC 7 (S 0x80 10000 ns) now has a static guard band added earlier than its 'gate close' event, such that packet overruns won't occur in the worst case of the largest packet possible. Since guard bands are statically determined based on the per-tc QSYS_QMAXSDU_CFG_* with a fallback on the port-based QSYS_PORT_MAX_SDU, we need to discuss what happens with TC 7 depending on kernel version, since the driver, prior to commit 55a515b1f5a9 ("net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port"), did not touch QSYS_QMAXSDU_CFG_*, and therefore relied on QSYS_PORT_MAX_SDU. 1 (before vsc9959_tas_guard_bands_update): QSYS_PORT_MAX_SDU defaults to 1518, and at gigabit this introduces a static guard band (independent of packet sizes) of 12144 ns, plus QSYS::HSCH_MISC_CFG.FRM_ADJ (bit time of 20 octets => 160 ns). But this is larger than the time window itself, of 10000 ns. So, the queue system never considers a frame with TC 7 as eligible for transmission, since the gate practically never opens, and these frames are forever stuck in the TX queues and hang the port. 2 (after vsc9959_tas_guard_bands_update): Under the sole goal of enabling oversized frame dropping, we make an effort to set QSYS_QMAXSDU_CFG_7 to 1230 bytes. But QSYS_QMAXSDU_CFG_7 plays one more role, which we did not take into account: per-tc static guard band, expressed in L2 byte time (auto-adjusted for FCS and L1 overhead). There is a discrepancy between what the driver thinks (that there is no guard band, and 100% of min_gate_len[tc] is available for egress scheduling) and what the hardware actually does (crops the equivalent of QSYS_QMAXSDU_CFG_7 ns out of min_gate_len[tc]). 
In practice, this means that the hardware thinks it has exactly 0 ns for scheduling tc 7. In both cases, even minimum sized Ethernet frames are stuck on egress rather than being considered for scheduling on TC 7, even if they would fit given a proper configuration. Considering the current situation, with vsc9959_tas_guard_bands_update(), frames between 60 octets and 1230 octets in size are not eligible for oversized dropping (because they are smaller than QSYS_QMAXSDU_CFG_7), but won't be considered as eligible for scheduling either, because the min_gate_len[7] (10000 ns) minus the guard band determined by QSYS_QMAXSDU_CFG_7 (1230 octets * 8 ns per octet == 9840 ns) minus the guard band auto-added for L1 overhead by QSYS::HSCH_MISC_CFG.FRM_ADJ (20 octets * 8 ns per octet == 160 ns) leaves 0 ns for scheduling in the queue system proper. Investigating the hardware behavior, it becomes apparent that the queue system needs precisely 33 ns of 'gate open' time in order to consider a frame as eligible for scheduling to a tc. So the solution to this problem is to amend vsc9959_tas_guard_bands_update(), by giving the per-tc guard bands less space by exactly 33 ns, just enough for one frame to be scheduled in that interval. This allows the queue system to make forward progress for that port-tc, and prevents it from hanging. Fixes: 297c4de6f780 ("net: dsa: felix: re-enable TAS guard band mode") Reported-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
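The guard band arithmetic in this message can be checked with a short sketch. It only restates the figures quoted in the text (8 ns per octet at gigabit, 20 octets of L1 overhead, 33 ns of minimum 'gate open' time); the function and macro names are hypothetical and this is not the driver's calculation.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_L1_OVERHEAD_OCTETS	20	/* FRM_ADJ figure quoted in the text */
#define TOY_MIN_OPEN_NS		33	/* minimum 'gate open' time from the text */

/*
 * Time left for scheduling once the guard band for max_sdu_octets plus the
 * L1 overhead is cropped out of the gate interval. Zero or negative means
 * the gate practically never opens.
 */
static int64_t toy_sched_window_ns(int64_t gate_ns, int64_t max_sdu_octets,
				   int64_t ns_per_octet)
{
	assert(ns_per_octet > 0);
	return gate_ns - (max_sdu_octets + TOY_L1_OVERHEAD_OCTETS) * ns_per_octet;
}

/* Largest L1 frame size whose guard band still leaves TOY_MIN_OPEN_NS. */
static int64_t toy_max_l1_octets(int64_t gate_ns, int64_t ns_per_octet)
{
	assert(ns_per_octet > 0);
	return (gate_ns - TOY_MIN_OPEN_NS) / ns_per_octet;
}
```

For a 10000 ns gate at 8 ns per octet, a max SDU of 1230 octets leaves a window of 0 ns, the 1518-octet default leaves a negative window, and the largest L1 size that keeps 33 ns free works out to 1245 octets, matching the figure in the merge description.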
2022-09-07	s390/smp: enforce lowcore protection on CPU restart	Alexander Gordeev
	As a result of commit 915fea04f932 ("s390/smp: enable DAT before CPU restart callback is called") the low-address protection bit gets mistakenly unset in the control register 0 save area of the absolute zero memory. That area is used when a manual PSW restart happens to hit an offline CPU. In this case the low-address protection for that CPU will be dropped. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Fixes: 915fea04f932 ("s390/smp: enable DAT before CPU restart callback is called") Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-07	s390/boot: fix absolute zero lowcore corruption on boot	Alexander Gordeev
	Crash dump always starts on CPU0. In case CPU0 is offline the prefix page is not installed and the absolute zero lowcore is used. However, struct lowcore::mcesad is never assigned and stays zero. That leads to the __machine_kdump() -> save_vx_regs() call silently storing vector registers to the absolute lowcore at offset 0x11b0. Fixes: a62bc0739253 ("s390/kdump: add support for vector extension") Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-07	gpio: mpc8xxx: Fix support for IRQ_TYPE_LEVEL_LOW flow_type in mpc85xx	Pali Rohár
	Commit e39d5ef67804 ("powerpc/5xxx: extend mpc8xxx_gpio driver to support mpc512x gpios") implemented support for the IRQ_TYPE_LEVEL_LOW flow type in mpc512x via the falling edge type. Do the same for mpc85xx, for which support was added in commit 345e5c8a1cc3 ("powerpc: Add interrupt support to mpc8xxx_gpio"). This fixes probing of the lm90 hwmon driver on mpc85xx based boards which use a level interrupt. Without it, the kernel prints an error and lm90 refuses to work: [ 15.258370] genirq: Setting trigger mode 8 for irq 49 failed (mpc8xxx_irq_set_type+0x0/0xf8) [ 15.267168] lm90 0-004c: cannot request IRQ 49 [ 15.272708] lm90: probe of 0-004c failed with error -22 Fixes: 345e5c8a1cc3 ("powerpc: Add interrupt support to mpc8xxx_gpio") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2022-09-07	ALSA: usb-audio: Clear fixed clock rate at closing EP	Takashi Iwai
	The recent commit c11117b634f4 ("ALSA: usb-audio: Refcount multiple accesses on the single clock") tries to manage the clock rate shared by several endpoints. This was intended for avoiding the unmatched rate by a different endpoint, but unfortunately, it introduced a regression for PulseAudio and pipewire, too; those applications try to probe the multiple possible rates (44.1k and 48kHz) and setting up the normal rate fails but only the last rate is applied. The cause is that the last sample rate is still left to the clock reference even after closing the endpoint, and this value is still used at the next open. It happens only when applications set up via PCM prepare but don't start/stop the stream; the rate is reset when the stream is stopped, but it's not cleared at close. This patch addresses the issue above, simply by clearing the rate set in the clock reference at the last close of each endpoint. Fixes: c11117b634f4 ("ALSA: usb-audio: Refcount multiple accesses on the single clock") Reported-by: Jason A. Donenfeld <Jason@zx2c4.com> Tested-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/all/YxXIWv8dYmg1tnXP@zx2c4.com/ Link: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/2620 Link: https://lore.kernel.org/r/20220907100421.6443-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-09-07	iommu/amd: use full 64-bit value in build_completion_wait()	John Sperbeck
	We started using a 64 bit completion value. Unfortunately, we only stored the low 32-bits, so a very large completion value would never be matched in iommu_completion_wait(). Fixes: c69d89aff393 ("iommu/amd: Use 4K page for completion wait write-back semaphore") Signed-off-by: John Sperbeck <jsperbeck@google.com> Link: https://lore.kernel.org/r/20220801192229.3358786-1-jsperbeck@google.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
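Why a large completion value can never match when only its low 32 bits are stored can be shown with a small sketch. The structure, the word positions and the helper names below are assumptions chosen for illustration; they are not the AMD IOMMU command layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical command buffer made of four 32-bit words. */
struct toy_cmd {
	uint32_t data[4];
};

/*
 * Store a 64-bit completion value into two 32-bit words. With full == 0
 * only the low half is stored, which mimics the truncation described above.
 */
static void toy_build_wait(struct toy_cmd *cmd, uint64_t value, int full)
{
	assert(cmd);
	cmd->data[2] = (uint32_t)value;				/* low 32 bits */
	cmd->data[3] = full ? (uint32_t)(value >> 32) : 0;	/* high 32 bits */
}

/* Value that would be written back and later compared by the waiter. */
static uint64_t toy_written_value(const struct toy_cmd *cmd)
{
	return ((uint64_t)cmd->data[3] << 32) | cmd->data[2];
}
```

With a value such as 0x100000005, the truncating variant writes back 5, so a waiter comparing against the full 64-bit value would never see a match.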
2022-09-07	dma-mapping: mark dma_supported static	Christoph Hellwig
	Now that the remaining users in drivers are gone, this function can be marked static. Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-09-07	swiotlb: fix a typo	Chao Gao
	"overwirte" isn't a word. It should be "overwrite". Signed-off-by: Chao Gao <chao.gao@intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-09-07	swiotlb: avoid potential left shift overflow	Chao Gao
	The second operand passed to slot_addr() is declared as int or unsigned int in all call sites. The left-shift to get the offset of a slot can overflow if swiotlb size is larger than 4G. Convert the macro to an inline function and declare the second argument as phys_addr_t to avoid the potential overflow. Fixes: 26a7e094783d ("swiotlb: refactor swiotlb_tbl_map_single") Signed-off-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Dongli Zhang <dongli.zhang@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
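The overflow can be reproduced outside the kernel with a short sketch. TOY_SLOT_SHIFT is assumed to mirror a 2 KiB slot size, and both helpers are hypothetical; the example uses unsigned int so that the wrap-around is well defined in C, whereas overflowing a signed int shift would be undefined behaviour.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SLOT_SHIFT 11	/* assumption: 2 KiB slots */

/* The demonstration below relies on a 32-bit unsigned int. */
static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit unsigned int");

/* Shift performed in 32-bit arithmetic: the result wraps at 4 GiB. */
static uint64_t toy_slot_off_narrow(unsigned int idx)
{
	return idx << TOY_SLOT_SHIFT;
}

/* Index widened to 64 bits before shifting: no wrap for large pools. */
static uint64_t toy_slot_off_wide(uint64_t idx)
{
	return idx << TOY_SLOT_SHIFT;
}
```

Slot index 1 << 21 corresponds to an offset of exactly 4 GiB: the narrow variant yields 0, while the wide variant yields 0x100000000.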
2022-09-07	dma-debug: improve search for partial syncs	Robin Murphy
	When bucket_find_contains() tries to find the original entry for a partial sync, it manages to constrain its search in a way that is both too restrictive and not restrictive enough. A driver which only uses single mappings rather than scatterlists might not set max_seg_size, but could still technically perform a partial sync at an offset of more than 64KB into a sufficiently large mapping, so we could stop searching too early before reaching a legitimate entry. Conversely, if no valid entry is present and max_range is large enough, we can pointlessly search buckets that we've already searched, or that represent an impossible wrapping around the bottom of the address space. At worst, the (legitimate) case of max_seg_size == UINT_MAX can make the loop infinite. Replace the fragile and frankly hard-to-follow "range" logic with a simple counted loop for the number of possible hash buckets below the given address. Reported-by: Yunfei Wang <yf.wang@mediatek.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-09-07	Revert "swiotlb: panic if nslabs is too small"	Yu Zhao
	This reverts commit 0bf28fc40d89b1a3e00d1b79473bad4e9ca20ad1. Reasons: 1. new panic()s shouldn't be added [1]. 2. It does no "cleanup" but breaks MIPS [2]. v2: properly solved the conflict [3] with commit 20347fca71a38 ("swiotlb: split up the global swiotlb lock") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> [1] https://lore.kernel.org/r/CAHk-=wit-DmhMfQErY29JSPjFgebx_Ld+pnerc4J2Ag990WwAA@mail.gmail.com/ [2] https://lore.kernel.org/r/20220820012031.1285979-1-yuzhao@google.com/ [3] https://lore.kernel.org/r/202208310701.LKr1WDCh-lkp@intel.com/ Fixes: 0bf28fc40d89b ("swiotlb: panic if nslabs is too small") Signed-off-by: Yu Zhao <yuzhao@google.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-09-07	RDMA/irdma: Report RNR NAK generation in device caps	Sindhu-Devale
	Report RNR NAK generation when device capabilities are queried. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220906223244.1119-6-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07	RDMA/irdma: Use s/g array in post send only when it's valid	Sindhu-Devale
	A send with invalidate verb call can pass in an uninitialized s/g array with 0 sge's, which is filled into the irdma WQE and causes a HW asynchronous event. Fix this by using the s/g array in irdma post send only when it's valid. Fixes: 551c46e ("RDMA/irdma: Add user/kernel shared libraries") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220906223244.1119-5-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07	RDMA/irdma: Return correct WC error for bind operation failure	Sindhu-Devale
	When a QP and a MR on a local host are in different PDs, the HW generates an asynchronous event (AE). The same AE is generated when a QP and a MW are in different PDs during a bind operation. Return the more appropriate IBV_WC_MW_BIND_ERR for the latter case by checking the OP type from the CQE in error. Fixes: 551c46edc769 ("RDMA/irdma: Add user/kernel shared libraries") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220906223244.1119-4-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07	RDMA/irdma: Return error on MR deregister CQP failure	Sindhu-Devale
	The MR deregister CQP can fail if an MW is bound to it. Return an appropriate error for this case. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220906223244.1119-3-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07	RDMA/irdma: Report the correct max cqes from query device	Sindhu-Devale
	Report the correct max cqes available to an application taking into account a reserved entry to detect overflow. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Link: https://lore.kernel.org/r/20220906223244.1119-2-shiraz.saleem@intel.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07	wifi: iwlwifi: don't spam logs with NSS>2 messages	Jason A. Donenfeld
	I get a log line like this every 4 seconds when connected to my AP: [15650.221468] iwlwifi 0000:09:00.0: Got NSS = 4 - trimming to 2 Looking at the code, this seems to be related to a hardware limitation, and there's nothing to be done. In an effort to keep my dmesg manageable, downgrade this error to "debug" rather than "info". Cc: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20220905172246.105383-1-Jason@zx2c4.com
2022-09-07	efi/x86: libstub: remove unused variable	chen zhang
	The variable "has_system_memory" is unused in function ‘adjust_memory_range_protection’, remove it. Signed-off-by: chen zhang <chenzhang@kylinos.cn> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2022-09-07	nvme: requeue aen after firmware activation	Keith Busch
	The driver prevents async event work while handling a processing paused event, but someone needs to restart it after the controller returns to a live state. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216400 Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-09-07	nvmet: fix mar and mor off-by-one errors	Dennis Maisenbacher
	Maximum Active Resources (MAR) and Maximum Open Resources (MOR) are 0's based values where a value of 0xffffffff indicates that there is no limit. Decrement the values that are returned by bdev_max_open_zones and bdev_max_active_zones as the block layer helpers are not 0's based. A 0 returned by the block layer helpers indicates no limit, thus convert it to 0xffffffff (U32_MAX). Fixes: aaf2e048af27 ("nvmet: add ZBD over ZNS backend support") Suggested-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Dennis Maisenbacher <dennis.maisenbacher@wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
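The conversion described above can be written down as a small sketch. The helper name is hypothetical and the input simply stands in for the value returned by a block layer helper; this is not the nvmet code itself.

```c
#include <assert.h>
#include <stdint.h>

/* The conversion below assumes the block layer count fits in 32 bits. */
static_assert(sizeof(unsigned int) <= sizeof(uint32_t),
	      "sketch assumes unsigned int is at most 32 bits wide");

/*
 * Convert a plain count (0 means "no limit") into a 0's based field where
 * 0xffffffff means "no limit" and N - 1 encodes a limit of N.
 */
static uint32_t toy_to_zeroes_based(unsigned int blk_limit)
{
	if (blk_limit == 0)
		return UINT32_MAX;	/* no limit */
	return (uint32_t)(blk_limit - 1);
}
```

A block layer limit of 1 is therefore reported as 0, a limit of 14 as 13, and "no limit" as 0xffffffff.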