linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2021-12-21	usb: mtu3: fix interval value for intr and isoc	Chunfeng Yun
	Use the Interval value from isoc/intr endpoint descriptor, no need minus one. The original code doesn't cause transfer error for normal cases, but it may have side effect with respond time of ERDY or tPingTimeout. Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Link: https://lore.kernel.org/r/20211218095749.6250-1-chunfeng.yun@mediatek.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-21	usb: gadget: f_fs: Clear ffs_eventfd in ffs_data_clear.	Vincent Pelletier
	ffs_data_clear is indirectly called from both ffs_fs_kill_sb and ffs_ep0_release, so it ends up being called twice when userland closes ep0 and then unmounts f_fs. If userland provided an eventfd along with function's USB descriptors, it ends up calling eventfd_ctx_put as many times, causing a refcount underflow. NULL-ify ffs_eventfd to prevent these extraneous eventfd_ctx_put calls. Also, set epfiles to NULL right after de-allocating it, for readability. For completeness, ffs_data_clear actually ends up being called thrice, the last call being before the whole ffs structure gets freed, so when this specific sequence happens there is a second underflow happening (but not being reported): /sys/kernel/debug/tracing# modprobe usb_f_fs /sys/kernel/debug/tracing# echo ffs_data_clear > set_ftrace_filter /sys/kernel/debug/tracing# echo function > current_tracer /sys/kernel/debug/tracing# echo 1 > tracing_on (setup gadget, run and kill function userland process, teardown gadget) /sys/kernel/debug/tracing# echo 0 > tracing_on /sys/kernel/debug/tracing# cat trace smartcard-openp-436 [000] ..... 1946.208786: ffs_data_clear <-ffs_data_closed smartcard-openp-431 [000] ..... 1946.279147: ffs_data_clear <-ffs_data_closed smartcard-openp-431 [000] .n... 1946.905512: ffs_data_clear <-ffs_data_put Warning output corresponding to above trace: [ 1946.284139] WARNING: CPU: 0 PID: 431 at lib/refcount.c:28 refcount_warn_saturate+0x110/0x15c [ 1946.293094] refcount_t: underflow; use-after-free. [ 1946.298164] Modules linked in: usb_f_ncm(E) u_ether(E) usb_f_fs(E) hci_uart(E) btqca(E) btrtl(E) btbcm(E) btintel(E) bluetooth(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) bcm2835_v4l2(CE) bcm2835_mmal_vchiq(CE) videobuf2_vmalloc(E) videobuf2_memops(E) sha512_generic(E) videobuf2_v4l2(E) sha512_arm(E) videobuf2_common(E) videodev(E) cpufreq_dt(E) snd_bcm2835(CE) brcmfmac(E) mc(E) vc4(E) ctr(E) brcmutil(E) snd_soc_core(E) snd_pcm_dmaengine(E) drbg(E) snd_pcm(E) snd_timer(E) snd(E) soundcore(E) drm_kms_helper(E) cec(E) ansi_cprng(E) rc_core(E) syscopyarea(E) raspberrypi_cpufreq(E) sysfillrect(E) sysimgblt(E) cfg80211(E) max17040_battery(OE) raspberrypi_hwmon(E) fb_sys_fops(E) regmap_i2c(E) ecdh_generic(E) rfkill(E) ecc(E) bcm2835_rng(E) rng_core(E) vchiq(CE) leds_gpio(E) libcomposite(E) fuse(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) crc32c_generic(E) sdhci_iproc(E) sdhci_pltfm(E) sdhci(E) [ 1946.399633] CPU: 0 PID: 431 Comm: smartcard-openp Tainted: G C OE 5.15.0-1-rpi #1 Debian 5.15.3-1 [ 1946.417950] Hardware name: BCM2835 [ 1946.425442] Backtrace: [ 1946.432048] [<c08d60a0>] (dump_backtrace) from [<c08d62ec>] (show_stack+0x20/0x24) [ 1946.448226] r7:00000009 r6:0000001c r5:c04a948c r4:c0a64e2c [ 1946.458412] [<c08d62cc>] (show_stack) from [<c08d9ae0>] (dump_stack+0x28/0x30) [ 1946.470380] [<c08d9ab8>] (dump_stack) from [<c0123500>] (__warn+0xe8/0x154) [ 1946.482067] r5:c04a948c r4:c0a71dc8 [ 1946.490184] [<c0123418>] (__warn) from [<c08d6948>] (warn_slowpath_fmt+0xa0/0xe4) [ 1946.506758] r7:00000009 r6:0000001c r5:c0a71dc8 r4:c0a71e04 [ 1946.517070] [<c08d68ac>] (warn_slowpath_fmt) from [<c04a948c>] (refcount_warn_saturate+0x110/0x15c) [ 1946.535309] r8:c0100224 r7:c0dfcb84 r6:ffffffff r5:c3b84c00 r4:c24a17c0 [ 1946.546708] [<c04a937c>] (refcount_warn_saturate) from [<c0380134>] (eventfd_ctx_put+0x48/0x74) [ 1946.564476] [<c03800ec>] (eventfd_ctx_put) from [<bf5464e8>] (ffs_data_clear+0xd0/0x118 [usb_f_fs]) [ 1946.582664] r5:c3b84c00 r4:c2695b00 [ 1946.590668] [<bf546418>] (ffs_data_clear [usb_f_fs]) from [<bf547cc0>] (ffs_data_closed+0x9c/0x150 [usb_f_fs]) [ 1946.609608] r5:bf54d014 r4:c2695b00 [ 1946.617522] [<bf547c24>] (ffs_data_closed [usb_f_fs]) from [<bf547da0>] (ffs_fs_kill_sb+0x2c/0x30 [usb_f_fs]) [ 1946.636217] r7:c0dfcb84 r6:c3a12260 r5:bf54d014 r4:c229f000 [ 1946.646273] [<bf547d74>] (ffs_fs_kill_sb [usb_f_fs]) from [<c0326d50>] (deactivate_locked_super+0x54/0x9c) [ 1946.664893] r5:bf54d014 r4:c229f000 [ 1946.672921] [<c0326cfc>] (deactivate_locked_super) from [<c0326df8>] (deactivate_super+0x60/0x64) [ 1946.690722] r5:c2a09000 r4:c229f000 [ 1946.698706] [<c0326d98>] (deactivate_super) from [<c0349a28>] (cleanup_mnt+0xe4/0x14c) [ 1946.715553] r5:c2a09000 r4:00000000 [ 1946.723528] [<c0349944>] (cleanup_mnt) from [<c0349b08>] (__cleanup_mnt+0x1c/0x20) [ 1946.739922] r7:c0dfcb84 r6:c3a12260 r5:c3a126fc r4:00000000 [ 1946.750088] [<c0349aec>] (__cleanup_mnt) from [<c0143d10>] (task_work_run+0x84/0xb8) [ 1946.766602] [<c0143c8c>] (task_work_run) from [<c010bdc8>] (do_work_pending+0x470/0x56c) [ 1946.783540] r7:5ac3c35a r6:c0d0424c r5:c200bfb0 r4:c200a000 [ 1946.793614] [<c010b958>] (do_work_pending) from [<c01000c0>] (slow_work_pending+0xc/0x20) [ 1946.810553] Exception stack(0xc200bfb0 to 0xc200bff8) [ 1946.820129] bfa0: 00000000 00000000 000000aa b5e21430 [ 1946.837104] bfc0: bef867a0 00000001 bef86840 00000034 bef86838 bef86790 bef86794 bef867a0 [ 1946.854125] bfe0: 00000000 bef86798 b67b7a1c b6d626a4 60000010 b5a23760 [ 1946.865335] r10:00000000 r9:c200a000 r8:c0100224 r7:00000034 r6:bef86840 r5:00000001 [ 1946.881914] r4:bef867a0 [ 1946.888793] ---[ end trace 7387f2a9725b28d0 ]--- Fixes: 5e33f6fdf735 ("usb: gadget: ffs: add eventfd notification about ffs events") Cc: stable <stable@vger.kernel.org> Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com> Link: https://lore.kernel.org/r/f79eeea29f3f98de6782a064ec0f7351ad2f598f.1639793920.git.plr.vincent@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-21	ath11k: add regdb.bin download for regdb offload	Wen Gong
	The regdomain is self-managed type for ath11k, the regdomain info is reported from firmware, it is not from wireless regdb. Firmware fetch the regdomain info from board data file before. Currently most of the regdomain info has moved to another file regdb.bin from board data file for some chips such as QCA6390 and WCN6855, so the regdomain info left in board data file is not enough to support the feature which need more regdomain info. After download regdb.bin, firmware will fetch the regdomain info from regdb.bin instead of board data file and report to ath11k. If it does not have the file regdb.bin, it also can initialize wlan success and firmware then fetch regdomain info from board data file. Add download the regdb.bin before download board data for some specific chip which support supports_regdb in hardware parameters. download regdb.bin log: [430082.334162] ath11k_pci 0000:05:00.0: chip_id 0x2 chip_family 0xb board_id 0x106 soc_id 0x400c0200 [430082.334169] ath11k_pci 0000:05:00.0: fw_version 0x110c8b4c fw_build_timestamp 2021-10-25 07:41 fw_build_id QC_IMAGE_VERSION_STRING=WLAN.HSP.1.1-02892-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3 [430082.334414] ath11k_pci 0000:05:00.0: boot firmware request ath11k/WCN6855/hw2.0/regdb.bin size 24310 output of "iw reg get" global country US: DFS-FCC (2402 - 2472 @ 40), (N/A, 30), (N/A) (5170 - 5250 @ 80), (N/A, 23), (N/A), AUTO-BW (5250 - 5330 @ 80), (N/A, 23), (0 ms), DFS, AUTO-BW (5490 - 5730 @ 160), (N/A, 23), (0 ms), DFS (5735 - 5835 @ 80), (N/A, 30), (N/A) (57240 - 63720 @ 2160), (N/A, 40), (N/A) phy#0 (self-managed) country US: DFS-FCC (2402 - 2472 @ 40), (6, 30), (N/A) (5170 - 5250 @ 80), (N/A, 24), (N/A), AUTO-BW (5250 - 5330 @ 80), (N/A, 24), (0 ms), DFS, AUTO-BW (5490 - 5730 @ 160), (N/A, 24), (0 ms), DFS, AUTO-BW (5735 - 5895 @ 160), (N/A, 30), (N/A), AUTO-BW (5945 - 7125 @ 160), (N/A, 24), (N/A), NO-OUTDOOR, AUTO-BW Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1 Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1 Signed-off-by: Wen Gong <quic_wgong@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/20211220062355.17021-1-quic_wgong@quicinc.com
2021-12-20	igb: fix deadlock caused by taking RTNL in RPM resume path	Heiner Kallweit
	Recent net core changes caused an issue with few Intel drivers (reportedly igb), where taking RTNL in RPM resume path results in a deadlock. See [0] for a bug report. I don't think the core changes are wrong, but taking RTNL in RPM resume path isn't needed. The Intel drivers are the only ones doing this. See [1] for a discussion on the issue. Following patch changes the RPM resume path to not take RTNL. [0] https://bugzilla.kernel.org/show_bug.cgi?id=215129 [1] https://lore.kernel.org/netdev/20211125074949.5f897431@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com/t/ Fixes: bd869245a3dc ("net: core: try to runtime-resume detached device in __dev_open") Fixes: f32a21376573 ("ethtool: runtime-resume netdev parent before ethtool ioctl ops") Tested-by: Martin Stolpe <martin.stolpe@gmail.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20211220201844.2714498-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20	gve: Correct order of processing device options	Jeroen de Borst
	The legacy raw addressing device option was processed before the new RDA queue format option. This caused the supported features mask, which is provided only on the RDA queue format option, not to be set. This disabled jumbo-frame support when using raw adressing. Fixes: 255489f5b33c ("gve: Add a jumbo-frame device option") Signed-off-by: Jeroen de Borst <jeroendb@google.com> Link: https://lore.kernel.org/r/20211220192746.2900594-1-jeroendb@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20	net: skip virtio_net_hdr_set_proto if protocol already set	Willem de Bruijn
	virtio_net_hdr_set_proto infers skb->protocol from the virtio_net_hdr gso_type, to avoid packets getting dropped for lack of a proto type. Its protocol choice is a guess, especially in the case of UFO, where the single VIRTIO_NET_HDR_GSO_UDP label covers both UFOv4 and UFOv6. Skip this best effort if the field is already initialized. Whether explicitly from userspace, or implicitly based on an earlier call to dev_parse_header_protocol (which is more robust, but was introduced after this patch). Fixes: 9d2f67e43b73 ("net/packet: fix packet drop as of virtio gso") Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20211220145027.2784293-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20	net: accept UFOv6 packages in virtio_net_hdr_to_skb	Willem de Bruijn
	Skb with skb->protocol 0 at the time of virtio_net_hdr_to_skb may have a protocol inferred from virtio_net_hdr with virtio_net_hdr_set_proto. Unlike TCP, UDP does not have separate types for IPv4 and IPv6. Type VIRTIO_NET_HDR_GSO_UDP is guessed to be IPv4/UDP. As of the below commit, UFOv6 packets are dropped due to not matching the protocol as obtained from dev_parse_header_protocol. Invert the test to take that L2 protocol field as starting point and pass both UFOv4 and UFOv6 for VIRTIO_NET_HDR_GSO_UDP. Fixes: 924a9bc362a5 ("net: check if protocol extracted by virtio_net_hdr_set_proto is correct") Link: https://lore.kernel.org/netdev/CABcq3pG9GRCYqFDBAJ48H1vpnnX=41u+MhQnayF1ztLH4WX0Fw@mail.gmail.com/ Reported-by: Andrew Melnichenko <andrew@daynix.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20211220144901.2784030-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20	docs: networking: replace skb_hwtstamp_tx with skb_tstamp_tx	Willem de Bruijn
	Tiny doc fix. The hardware transmit function was called skb_tstamp_tx from its introduction in commit ac45f602ee3d ("net: infrastructure for hardware time stamping") in the same series as this documentation. Fixes: cb9eff097831 ("net: new user space API for time stamping of incoming and outgoing packets") Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20211220144608.2783526-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20	inet: fully convert sk->sk_rx_dst to RCU rules	Eric Dumazet
	syzbot reported various issues around early demux, one being included in this changelog [1] sk->sk_rx_dst is using RCU protection without clearly documenting it. And following sequences in tcp_v4_do_rcv()/tcp_v6_do_rcv() are not following standard RCU rules. [a] dst_release(dst); [b] sk->sk_rx_dst = NULL; They look wrong because a delete operation of RCU protected pointer is supposed to clear the pointer before the call_rcu()/synchronize_rcu() guarding actual memory freeing. In some cases indeed, dst could be freed before [b] is done. We could cheat by clearing sk_rx_dst before calling dst_release(), but this seems the right time to stick to standard RCU annotations and debugging facilities. [1] BUG: KASAN: use-after-free in dst_check include/net/dst.h:470 [inline] BUG: KASAN: use-after-free in tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792 Read of size 2 at addr ffff88807f1cb73a by task syz-executor.5/9204 CPU: 0 PID: 9204 Comm: syz-executor.5 Not tainted 5.16.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_address_description.constprop.0.cold+0x8d/0x320 mm/kasan/report.c:247 __kasan_report mm/kasan/report.c:433 [inline] kasan_report.cold+0x83/0xdf mm/kasan/report.c:450 dst_check include/net/dst.h:470 [inline] tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792 ip_rcv_finish_core.constprop.0+0x15de/0x1e80 net/ipv4/ip_input.c:340 ip_list_rcv_finish.constprop.0+0x1b2/0x6e0 net/ipv4/ip_input.c:583 ip_sublist_rcv net/ipv4/ip_input.c:609 [inline] ip_list_rcv+0x34e/0x490 net/ipv4/ip_input.c:644 __netif_receive_skb_list_ptype net/core/dev.c:5508 [inline] __netif_receive_skb_list_core+0x549/0x8e0 net/core/dev.c:5556 __netif_receive_skb_list net/core/dev.c:5608 [inline] netif_receive_skb_list_internal+0x75e/0xd80 net/core/dev.c:5699 gro_normal_list net/core/dev.c:5853 [inline] gro_normal_list net/core/dev.c:5849 [inline] napi_complete_done+0x1f1/0x880 net/core/dev.c:6590 virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline] virtnet_poll+0xca2/0x11b0 drivers/net/virtio_net.c:1557 __napi_poll+0xaf/0x440 net/core/dev.c:7023 napi_poll net/core/dev.c:7090 [inline] net_rx_action+0x801/0xb40 net/core/dev.c:7177 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558 invoke_softirq kernel/softirq.c:432 [inline] __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649 common_interrupt+0x52/0xc0 arch/x86/kernel/irq.c:240 asm_common_interrupt+0x1e/0x40 arch/x86/include/asm/idtentry.h:629 RIP: 0033:0x7f5e972bfd57 Code: 39 d1 73 14 0f 1f 80 00 00 00 00 48 8b 50 f8 48 83 e8 08 48 39 ca 77 f3 48 39 c3 73 3e 48 89 13 48 8b 50 f8 48 89 38 49 8b 0e <48> 8b 3e 48 83 c3 08 48 83 c6 08 eb bc 48 39 d1 72 9e 48 39 d0 73 RSP: 002b:00007fff8a413210 EFLAGS: 00000283 RAX: 00007f5e97108990 RBX: 00007f5e97108338 RCX: ffffffff81d3aa45 RDX: ffffffff81d3aa45 RSI: 00007f5e97108340 RDI: ffffffff81d3aa45 RBP: 00007f5e97107eb8 R08: 00007f5e97108d88 R09: 0000000093c2e8d9 R10: 0000000000000000 R11: 0000000000000000 R12: 00007f5e97107eb0 R13: 00007f5e97108338 R14: 00007f5e97107ea8 R15: 0000000000000019 </TASK> Allocated by task 13: kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38 kasan_set_track mm/kasan/common.c:46 [inline] set_alloc_info mm/kasan/common.c:434 [inline] __kasan_slab_alloc+0x90/0xc0 mm/kasan/common.c:467 kasan_slab_alloc include/linux/kasan.h:259 [inline] slab_post_alloc_hook mm/slab.h:519 [inline] slab_alloc_node mm/slub.c:3234 [inline] slab_alloc mm/slub.c:3242 [inline] kmem_cache_alloc+0x202/0x3a0 mm/slub.c:3247 dst_alloc+0x146/0x1f0 net/core/dst.c:92 rt_dst_alloc+0x73/0x430 net/ipv4/route.c:1613 ip_route_input_slow+0x1817/0x3a20 net/ipv4/route.c:2340 ip_route_input_rcu net/ipv4/route.c:2470 [inline] ip_route_input_noref+0x116/0x2a0 net/ipv4/route.c:2415 ip_rcv_finish_core.constprop.0+0x288/0x1e80 net/ipv4/ip_input.c:354 ip_list_rcv_finish.constprop.0+0x1b2/0x6e0 net/ipv4/ip_input.c:583 ip_sublist_rcv net/ipv4/ip_input.c:609 [inline] ip_list_rcv+0x34e/0x490 net/ipv4/ip_input.c:644 __netif_receive_skb_list_ptype net/core/dev.c:5508 [inline] __netif_receive_skb_list_core+0x549/0x8e0 net/core/dev.c:5556 __netif_receive_skb_list net/core/dev.c:5608 [inline] netif_receive_skb_list_internal+0x75e/0xd80 net/core/dev.c:5699 gro_normal_list net/core/dev.c:5853 [inline] gro_normal_list net/core/dev.c:5849 [inline] napi_complete_done+0x1f1/0x880 net/core/dev.c:6590 virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline] virtnet_poll+0xca2/0x11b0 drivers/net/virtio_net.c:1557 __napi_poll+0xaf/0x440 net/core/dev.c:7023 napi_poll net/core/dev.c:7090 [inline] net_rx_action+0x801/0xb40 net/core/dev.c:7177 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558 Freed by task 13: kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38 kasan_set_track+0x21/0x30 mm/kasan/common.c:46 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:370 ____kasan_slab_free mm/kasan/common.c:366 [inline] ____kasan_slab_free mm/kasan/common.c:328 [inline] __kasan_slab_free+0xff/0x130 mm/kasan/common.c:374 kasan_slab_free include/linux/kasan.h:235 [inline] slab_free_hook mm/slub.c:1723 [inline] slab_free_freelist_hook+0x8b/0x1c0 mm/slub.c:1749 slab_free mm/slub.c:3513 [inline] kmem_cache_free+0xbd/0x5d0 mm/slub.c:3530 dst_destroy+0x2d6/0x3f0 net/core/dst.c:127 rcu_do_batch kernel/rcu/tree.c:2506 [inline] rcu_core+0x7ab/0x1470 kernel/rcu/tree.c:2741 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558 Last potentially related work creation: kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38 __kasan_record_aux_stack+0xf5/0x120 mm/kasan/generic.c:348 __call_rcu kernel/rcu/tree.c:2985 [inline] call_rcu+0xb1/0x740 kernel/rcu/tree.c:3065 dst_release net/core/dst.c:177 [inline] dst_release+0x79/0xe0 net/core/dst.c:167 tcp_v4_do_rcv+0x612/0x8d0 net/ipv4/tcp_ipv4.c:1712 sk_backlog_rcv include/net/sock.h:1030 [inline] __release_sock+0x134/0x3b0 net/core/sock.c:2768 release_sock+0x54/0x1b0 net/core/sock.c:3300 tcp_sendmsg+0x36/0x40 net/ipv4/tcp.c:1441 inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:819 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:724 sock_write_iter+0x289/0x3c0 net/socket.c:1057 call_write_iter include/linux/fs.h:2162 [inline] new_sync_write+0x429/0x660 fs/read_write.c:503 vfs_write+0x7cd/0xae0 fs/read_write.c:590 ksys_write+0x1ee/0x250 fs/read_write.c:643 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae The buggy address belongs to the object at ffff88807f1cb700 which belongs to the cache ip_dst_cache of size 176 The buggy address is located 58 bytes inside of 176-byte region [ffff88807f1cb700, ffff88807f1cb7b0) The buggy address belongs to the page: page:ffffea0001fc72c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7f1cb flags: 0xfff00000000200(slab\|node=0\|zone=1\|lastcpupid=0x7ff) raw: 00fff00000000200 dead000000000100 dead000000000122 ffff8881413bb780 raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112a20(GFP_ATOMIC\|__GFP_NOWARN\|__GFP_NORETRY\|__GFP_HARDWALL), pid 5, ts 108466983062, free_ts 108048976062 prep_new_page mm/page_alloc.c:2418 [inline] get_page_from_freelist+0xa72/0x2f50 mm/page_alloc.c:4149 __alloc_pages+0x1b2/0x500 mm/page_alloc.c:5369 alloc_pages+0x1a7/0x300 mm/mempolicy.c:2191 alloc_slab_page mm/slub.c:1793 [inline] allocate_slab mm/slub.c:1930 [inline] new_slab+0x32d/0x4a0 mm/slub.c:1993 ___slab_alloc+0x918/0xfe0 mm/slub.c:3022 __slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3109 slab_alloc_node mm/slub.c:3200 [inline] slab_alloc mm/slub.c:3242 [inline] kmem_cache_alloc+0x35c/0x3a0 mm/slub.c:3247 dst_alloc+0x146/0x1f0 net/core/dst.c:92 rt_dst_alloc+0x73/0x430 net/ipv4/route.c:1613 __mkroute_output net/ipv4/route.c:2564 [inline] ip_route_output_key_hash_rcu+0x921/0x2d00 net/ipv4/route.c:2791 ip_route_output_key_hash+0x18b/0x300 net/ipv4/route.c:2619 __ip_route_output_key include/net/route.h:126 [inline] ip_route_output_flow+0x23/0x150 net/ipv4/route.c:2850 ip_route_output_key include/net/route.h:142 [inline] geneve_get_v4_rt+0x3a6/0x830 drivers/net/geneve.c:809 geneve_xmit_skb drivers/net/geneve.c:899 [inline] geneve_xmit+0xc4a/0x3540 drivers/net/geneve.c:1082 __netdev_start_xmit include/linux/netdevice.h:4994 [inline] netdev_start_xmit include/linux/netdevice.h:5008 [inline] xmit_one net/core/dev.c:3590 [inline] dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3606 __dev_queue_xmit+0x299a/0x3650 net/core/dev.c:4229 page last free stack trace: reset_page_owner include/linux/page_owner.h:24 [inline] free_pages_prepare mm/page_alloc.c:1338 [inline] free_pcp_prepare+0x374/0x870 mm/page_alloc.c:1389 free_unref_page_prepare mm/page_alloc.c:3309 [inline] free_unref_page+0x19/0x690 mm/page_alloc.c:3388 qlink_free mm/kasan/quarantine.c:146 [inline] qlist_free_all+0x5a/0xc0 mm/kasan/quarantine.c:165 kasan_quarantine_reduce+0x180/0x200 mm/kasan/quarantine.c:272 __kasan_slab_alloc+0xa2/0xc0 mm/kasan/common.c:444 kasan_slab_alloc include/linux/kasan.h:259 [inline] slab_post_alloc_hook mm/slab.h:519 [inline] slab_alloc_node mm/slub.c:3234 [inline] kmem_cache_alloc_node+0x255/0x3f0 mm/slub.c:3270 __alloc_skb+0x215/0x340 net/core/skbuff.c:414 alloc_skb include/linux/skbuff.h:1126 [inline] alloc_skb_with_frags+0x93/0x620 net/core/skbuff.c:6078 sock_alloc_send_pskb+0x783/0x910 net/core/sock.c:2575 mld_newpack+0x1df/0x770 net/ipv6/mcast.c:1754 add_grhead+0x265/0x330 net/ipv6/mcast.c:1857 add_grec+0x1053/0x14e0 net/ipv6/mcast.c:1995 mld_send_initial_cr.part.0+0xf6/0x230 net/ipv6/mcast.c:2242 mld_send_initial_cr net/ipv6/mcast.c:1232 [inline] mld_dad_work+0x1d3/0x690 net/ipv6/mcast.c:2268 process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298 worker_thread+0x658/0x11f0 kernel/workqueue.c:2445 Memory state around the buggy address: ffff88807f1cb600: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88807f1cb680: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc >ffff88807f1cb700: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88807f1cb780: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc ffff88807f1cb800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fixes: 41063e9dd119 ("ipv4: Early TCP socket demux.") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20211220143330.680945-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20	Merge branch 'net-amd-xgbe-add-support-for-yellow-carp-ethernet-device'	Jakub Kicinski
	Raju Rangoju says: ==================== net: amd-xgbe: Add support for Yellow Carp Ethernet device Add support for newer version of Hardware, the Yellow Carp Ethernet device ==================== Link: https://lore.kernel.org/r/20211220135428.1123575-1-rrangoju@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20	net: amd-xgbe: Disable the CDR workaround path for Yellow Carp Devices	Raju Rangoju
	Yellow Carp Ethernet devices do not require Autonegotiation CDR workaround, hence disable the same. Co-developed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20	net: amd-xgbe: Alter the port speed bit range	Raju Rangoju
	Newer generation Hardware uses the slightly different port speed bit widths, so alter the existing port speed bit range to extend support to the newer generation hardware while maintaining the backward compatibility with older generation hardware. The previously reserved bits are now being used which then requires the adjustment to the BIT values, e.g.: Before: PORT_PROPERTY_0[22:21] - Reserved PORT_PROPERTY_0[26:23] - Supported Speeds After: PORT_PROPERTY_0[21] - Reserved PORT_PROPERTY_0[26:22] - Supported Speeds To make this backwards compatible, the existing BIT definitions for the port speeds are incremented by one to maintain the original position. Co-developed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20	net: amd-xgbe: Add Support for Yellow Carp Ethernet device	Raju Rangoju
	Yellow Carp Ethernet devices use the existing PCI ID but the window settings for the indirect PCS access have been altered. Add the check for Yellow Carp Ethernet devices to use the new register values. Co-developed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-20	mctp: emit RTM_NEWADDR and RTM_DELADDR	Matt Johnston
	Userspace can receive notification of MCTP address changes via RTNLGRP_MCTP_IFADDR rtnetlink multicast group. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Link: https://lore.kernel.org/r/20211220023104.1965509-1-matt@codeconstruct.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-21	powerpc/ptdump: Fix DEBUG_WX since generic ptdump conversion	Michael Ellerman
	In note_prot_wx() we bail out without reporting anything if CONFIG_PPC_DEBUG_WX is disabled. But CONFIG_PPC_DEBUG_WX was removed in the conversion to generic ptdump, we now need to use CONFIG_DEBUG_WX instead. Fixes: e084728393a5 ("powerpc/ptdump: Convert powerpc to GENERIC_PTDUMP") Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/r/20211203124112.2912562-1-mpe@ellerman.id.au
2021-12-20	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma	Linus Torvalds
	Pull rdma fixes from Jason Gunthorpe: "Last fixes before holidays. Nothing very exciting: - Work around a HW bug in HNS HIP08 - Recent memory leak regression in qib - Incorrect use of kfree() for vmalloc memory in hns" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/hns: Replace kfree() with kvfree() IB/qib: Fix memory leak in qib_user_sdma_queue_pkts() RDMA/hns: Fix RNR retransmission issue for HIP08
2021-12-20	rtlwifi: rtl8192cu: Fix WARNING when calling local_irq_restore() with ↵	Larry Finger
	interrupts enabled Syzbot reports the following WARNING: [200~raw_local_irq_restore() called with IRQs enabled WARNING: CPU: 1 PID: 1206 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x1d/0x20 kernel/locking/irqflag-debug.c:10 Hardware initialization for the rtl8188cu can run for as long as 350 ms, and the routine may be called with interrupts disabled. To avoid locking the machine for this long, the current routine saves the interrupt flags and enables local interrupts. The problem is that it restores the flags at the end without disabling local interrupts first. This patch fixes commit a53268be0cb9 ("rtlwifi: rtl8192cu: Fix too long disable of IRQs"). Reported-by: syzbot+cce1ee31614c171f5595@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Fixes: a53268be0cb9 ("rtlwifi: rtl8192cu: Fix too long disable of IRQs") Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20211215171105.20623-1-Larry.Finger@lwfinger.net
2021-12-20	rtl8xxxu: Improve the A-MPDU retransmission rate with RTS/CTS protection	Chris Chiu
	The A-MPDU TX retransmission rate is always high (> 20%) even in a very clean environment. However, the vendor driver retransimission rate is < 10% in the same test bed. The difference is the vendor driver starts the A-MPDU TXOP with initial RTS/CTS handshake which is observed in the air capture and the TX descriptor. Since the driver does not know how many frames will be aggregated and the estimated duration, forcing the RTS/CTS protection for A-MPDU helps to lower the retransmission rate from > 20% to ~12% in the same test setup with the vendor driver. Signed-off-by: Chris Chiu <chris.chiu@canonical.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20211215085819.729345-1-chris.chiu@canonical.com
2021-12-20	selftests/bpf: Correct the INDEX address in vmtest.sh	Pu Lehui
	Migration of vmtest to libbpf/ci will change the address of INDEX in vmtest.sh, which will cause vmtest.sh to not work due to the failure of rootfs fetching. Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> Link: https://lore.kernel.org/bpf/20211220050803.2670677-1-pulehui@huawei.com
2021-12-20	rtw88: 8822c: update rx settings to prevent potential hw deadlock	Po-Hao Huang
	These settings enables mac to detect and recover when rx fifo circuit deadlock occurs. Previous version missed this, so we fix it. Signed-off-by: Po-Hao Huang <phhuang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20211217012708.8623-1-pkshih@realtek.com
2021-12-20	rtw88: don't check CRC of VHT-SIG-B in 802.11ac signal	Chin-Yen Lee
	Currently all realtek wifi chip is set to check CRC of VHT-SIG-B in 802.11ac signal, but some AP don't calculate the CRC and packets from these AP can't be received and lead to disconnection. We disable the check defaultly to avoid this case. Signed-off-by: Chin-Yen Lee <timlee@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20211217012421.7859-1-pkshih@realtek.com
2021-12-20	rtw88: Disable PCIe ASPM while doing NAPI poll on 8821CE	Kai-Heng Feng
	Many Intel based platforms face system random freeze after commit 9e2fd29864c5 ("rtw88: add napi support"). The commit itself shouldn't be the culprit. My guess is that the 8821CE only leaves ASPM L1 for a short period when IRQ is raised. Since IRQ is masked during NAPI polling, the PCIe link stays at L1 and makes RX DMA extremely slow. Eventually the RX ring becomes messed up: [ 1133.194697] rtw_8821ce 0000:02:00.0: pci bus timeout, check dma status Since the 8821CE hardware may fail to leave ASPM L1, manually do it in the driver to resolve the issue. Fixes: 9e2fd29864c5 ("rtw88: add napi support") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215131 BugLink: https://bugs.launchpad.net/bugs/1927808 Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-by: Jian-Hong Pan <jhp@endlessos.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20211215114635.333767-1-kai.heng.feng@canonical.com
2021-12-20	wilc1000: fix double free error in probe()	Dan Carpenter
	Smatch complains that there is a double free in probe: drivers/net/wireless/microchip/wilc1000/spi.c:186 wilc_bus_probe() error: double free of 'spi_priv' drivers/net/wireless/microchip/wilc1000/sdio.c:163 wilc_sdio_probe() error: double free of 'sdio_priv' The problem is that wilc_netdev_cleanup() function frees "wilc->bus_data". That's confusing and a layering violation. Leave the frees in probe(), delete the free in wilc_netdev_cleanup(), and add some new frees to the remove() functions. Fixes: dc8b338f3bcd ("wilc1000: use goto labels on error path") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20211217150311.GC16611@kili
2021-12-20	iwlwifi: mvm: fix imbalanced locking in iwl_mvm_start_get_nvm()	Luca Coelho
	If iwl_transt_start_hw() failed, we were returning without calling wiphy_unlock() and rtnl_unlock(), causing a locking imbalance: drivers/net/wireless/intel/iwlwifi/mvm/ops.c:686:12: warning: context imbalance in 'iwl_mvm_start_get_nvm' - wrong count at exit Fix that by adding the unlock calls. Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20211219090128.42417-2-luca@coelho.fi
2021-12-20	iwlwifi: mvm: add dbg_time_point to debugfs	Johannes Berg
	We forgot to link this to debugfs, so the code is all dead. Add it for real. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/iwlwifi.20211219110000.d0f314101410.I7357c01179c35621686265d4da4a64d2333a5f1a@changeid
2021-12-20	iwlwifi: mvm: add missing min_size to kernel-doc	Johannes Berg
	On struct iwl_rx_handlers we should document the min_size member, do that. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/iwlwifi.20211219110000.0c42c428bc6b.I8bfa49d534acc5f513f2fb3dff2d6f22f6c45071@changeid
2021-12-20	iwlwifi: mei: fix W=1 warnings	Johannes Berg
	There are a few warnings due to kernel-doc not understanding the constructs the way they're done here, fix them. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/iwlwifi.20211219110000.1ef2bb24771c.I6a59ad2d64f719d3e27398951c8f1b678b0b1092@changeid
2021-12-20	Merge tag 'mt76-for-kvalo-2021-12-18' of https://github.com/nbd168/wireless	Kalle Valo
	mt76 patches for 5.17 * decap offload fixes * mt7915 fixes * mt7921 fixes * eeprom fixes * powersave handling fixes * SAR support * code cleanups
2021-12-20	ath11k: add support for hardware rfkill for QCA6390	Wen Gong
	When hardware rfkill is enabled in the firmware it will report the capability via using WMI_SYS_CAP_INFO_RFKILL bit in the WMI_SERVICE_READY event to the host. ath11k will check the capability, and if it is enabled then ath11k will set the GPIO information to firmware using WMI_PDEV_SET_PARAM. When the firmware detects hardware rfkill is enabled by the user, it will report it via WMI_RFKILL_STATE_CHANGE_EVENTID. Once ath11k receives the event it will send wmi command WMI_PDEV_SET_PARAM to the firmware and also notifies cfg80211. This only enable rfkill feature for QCA6390, rfkill_pin is all initialized to 0 for other chips in ath11k_hw_params. Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1 Signed-off-by: Wen Gong <quic_wgong@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/20211217102334.14907-1-quic_wgong@quicinc.com
2021-12-20	ath11k: report tx bitrate for iw wlan station dump	Wen Gong
	HTT_T2H_MSG_TYPE_PPDU_STATS_IND is a message which include the ppdu info, currently it is not report from firmware for ath11k, then the tx bitrate of "iw wlan0 station dump" always show an invalid value "tx bitrate: 6.0 MBit/s". To address the issue, this is to parse the info of tx complete report from firmware and indicate the tx rate to mac80211. After that, "iw wlan0 station dump" show the correct tx bit rate such as: tx bitrate: 78.0 MBit/s MCS 12 tx bitrate: 144.4 MBit/s VHT-MCS 7 short GI VHT-NSS 2 tx bitrate: 286.7 MBit/s HE-MCS 11 HE-NSS 2 HE-GI 0 HE-DCM 0 tx bitrate: 1921.5 MBit/s 160MHz HE-MCS 9 HE-NSS 2 HE-GI 0 HE-DCM 0 Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1 Signed-off-by: Wen Gong <quic_wgong@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/20211217093722.5739-1-quic_wgong@quicinc.com
2021-12-20	Merge tag 'spi-fix-v5.16-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fix from Mark Brown: "One small fix for a long standing issue with error handling on probe in the Armada driver" * tag 'spi-fix-v5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: change clk_disable_unprepare to clk_unprepare
2021-12-20	Merge tag 'regulator-fix-v5.16-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "Binding fix for v5.16 This fixes problems validating DT bindings using op_mode which wasn't described as it should have been when converting to DT schema" * tag 'regulator-fix-v5.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: dt-bindings: samsung,s5m8767: add missing op_mode to bucks
2021-12-20	ath9k: Fix out-of-bound memcpy in ath9k_hif_usb_rx_stream	Zekun Shen
	Large pkt_len can lead to out-out-bound memcpy. Current ath9k_hif_usb_rx_stream allows combining the content of two urb inputs to one pkt. The first input can indicate the size of the pkt. Any remaining size is saved in hif_dev->rx_remain_len. While processing the next input, memcpy is used with rx_remain_len. 4-byte pkt_len can go up to 0xffff, while a single input is 0x4000 maximum in size (MAX_RX_BUF_SIZE). Thus, the patch adds a check for pkt_len which must not exceed 2 * MAX_RX_BUG_SIZE. BUG: KASAN: slab-out-of-bounds in ath9k_hif_usb_rx_cb+0x490/0xed7 [ath9k_htc] Read of size 46393 at addr ffff888018798000 by task kworker/0:1/23 CPU: 0 PID: 23 Comm: kworker/0:1 Not tainted 5.6.0 #63 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014 Workqueue: events request_firmware_work_func Call Trace: <IRQ> dump_stack+0x76/0xa0 print_address_description.constprop.0+0x16/0x200 ? ath9k_hif_usb_rx_cb+0x490/0xed7 [ath9k_htc] ? ath9k_hif_usb_rx_cb+0x490/0xed7 [ath9k_htc] __kasan_report.cold+0x37/0x7c ? ath9k_hif_usb_rx_cb+0x490/0xed7 [ath9k_htc] kasan_report+0xe/0x20 check_memory_region+0x15a/0x1d0 memcpy+0x20/0x50 ath9k_hif_usb_rx_cb+0x490/0xed7 [ath9k_htc] ? hif_usb_mgmt_cb+0x2d9/0x2d9 [ath9k_htc] ? _raw_spin_lock_irqsave+0x7b/0xd0 ? _raw_spin_trylock_bh+0x120/0x120 ? __usb_unanchor_urb+0x12f/0x210 __usb_hcd_giveback_urb+0x1e4/0x380 usb_giveback_urb_bh+0x241/0x4f0 ? __hrtimer_run_queues+0x316/0x740 ? __usb_hcd_giveback_urb+0x380/0x380 tasklet_action_common.isra.0+0x135/0x330 __do_softirq+0x18c/0x634 irq_exit+0x114/0x140 smp_apic_timer_interrupt+0xde/0x380 apic_timer_interrupt+0xf/0x20 I found the bug using a custome USBFuzz port. It's a research work to fuzz USB stack/drivers. I modified it to fuzz ath9k driver only, providing hand-crafted usb descriptors to QEMU. After fixing the value of pkt_tag to ATH_USB_RX_STREAM_MODE_TAG in QEMU emulation, I found the KASAN report. The bug is triggerable whenever pkt_len is above two MAX_RX_BUG_SIZE. I used the same input that crashes to test the driver works when applying the patch. Signed-off-by: Zekun Shen <bruceshenzk@gmail.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/YXsidrRuK6zBJicZ@10-18-43-117.dynapool.wireless.nyu.edu
2021-12-20	ath9k_htc: fix NULL pointer dereference at ath9k_htc_tx_get_packet()	Tetsuo Handa
	syzbot is reporting lockdep warning at ath9k_wmi_event_tasklet() followed by kernel panic at get_htc_epid_queue() from ath9k_htc_tx_get_packet() from ath9k_htc_txstatus() [1], for ath9k_wmi_event_tasklet(WMI_TXSTATUS_EVENTID) depends on spin_lock_init() from ath9k_init_priv() being already completed. Since ath9k_wmi_event_tasklet() is set by ath9k_init_wmi() from ath9k_htc_probe_device(), it is possible that ath9k_wmi_event_tasklet() is called via tasklet interrupt before spin_lock_init() from ath9k_init_priv() from ath9k_init_device() from ath9k_htc_probe_device() is called. Let's hold ath9k_wmi_event_tasklet(WMI_TXSTATUS_EVENTID) no-op until ath9k_tx_init() completes. Link: https://syzkaller.appspot.com/bug?extid=31d54c60c5b254d6f75b [1] Reported-by: syzbot <syzbot+31d54c60c5b254d6f75b@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: syzbot <syzbot+31d54c60c5b254d6f75b@syzkaller.appspotmail.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/77b76ac8-2bee-6444-d26c-8c30858b8daa@i-love.sakura.ne.jp
2021-12-20	ath9k_htc: fix NULL pointer dereference at ath9k_htc_rxep()	Tetsuo Handa
	syzbot is reporting lockdep warning followed by kernel panic at ath9k_htc_rxep() [1], for ath9k_htc_rxep() depends on ath9k_rx_init() being already completed. Since ath9k_htc_rxep() is set by ath9k_htc_connect_svc(WMI_BEACON_SVC) from ath9k_init_htc_services(), it is possible that ath9k_htc_rxep() is called via timer interrupt before ath9k_rx_init() from ath9k_init_device() is called. Since we can't call ath9k_init_device() before ath9k_init_htc_services(), let's hold ath9k_htc_rxep() no-op until ath9k_rx_init() completes. Link: https://syzkaller.appspot.com/bug?extid=4d2d56175b934b9a7bf9 [1] Reported-by: syzbot <syzbot+4d2d56175b934b9a7bf9@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: syzbot <syzbot+4d2d56175b934b9a7bf9@syzkaller.appspotmail.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/2b88f416-b2cb-7a18-d688-951e6dc3fe92@i-love.sakura.ne.jp
2021-12-20	ath11k: fix warning of RCU usage for ath11k_mac_get_arvif_by_vdev_id()	Wen Gong
	When enable more debug config, it happen below warning. It is because the caller does not add rcu_read_lock()/rcu_read_unlock() to wrap the rcu_dereference(). Add rcu_read_lock()/rcu_read_unlock() to wrap rcu_dereference(), then fixed it. [ 180.716604] ============================= [ 180.716670] WARNING: suspicious RCU usage [ 180.716734] 5.16.0-rc4-wt-ath+ #542 Not tainted [ 180.716895] ----------------------------- [ 180.716957] drivers/net/wireless/ath/ath11k/mac.c:506 suspicious rcu_dereference_check() usage! [ 180.717023] other info that might help us debug this: [ 180.717087] rcu_scheduler_active = 2, debug_locks = 1 [ 180.717151] no locks held by swapper/0/0. [ 180.717215] stack backtrace: [ 180.717279] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Not tainted 5.16.0-rc4-wt-ath+ #542 [ 180.717346] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0067.2021.0528.1339 05/28/2021 [ 180.717411] Call Trace: [ 180.717475] <IRQ> [ 180.717541] dump_stack_lvl+0x57/0x7d [ 180.717610] ath11k_mac_get_arvif_by_vdev_id+0x1ab/0x2d0 [ath11k] [ 180.717694] ? ath11k_mac_get_arvif+0x140/0x140 [ath11k] [ 180.717798] ? ath11k_wmi_tlv_op_rx+0xc1b/0x2520 [ath11k] [ 180.717888] ? kfree+0xe8/0x2c0 [ 180.717959] ath11k_wmi_tlv_op_rx+0xc27/0x2520 [ath11k] [ 180.718038] ? ath11k_mgmt_rx_event+0xda0/0xda0 [ath11k] [ 180.718113] ? __lock_acquire+0xb72/0x1870 [ 180.718182] ? lockdep_hardirqs_on_prepare.part.0+0x18c/0x370 [ 180.718250] ? sched_clock_cpu+0x15/0x1b0 [ 180.718314] ? find_held_lock+0x33/0x110 [ 180.718381] ? __lock_release+0x4bd/0x9f0 [ 180.718447] ? lock_downgrade+0x130/0x130 [ 180.718517] ath11k_htc_rx_completion_handler+0x38f/0x5b0 [ath11k] [ 180.718596] ? __local_bh_enable_ip+0xa0/0x110 [ 180.718662] ath11k_ce_recv_process_cb+0x5ac/0x920 [ath11k] [ 180.718783] ? __lock_acquired+0x205/0x890 [ 180.718864] ? ath11k_ce_rx_post_pipe+0x970/0x970 [ath11k] [ 180.718949] ? __wake_up_bit+0x100/0x100 [ 180.719020] ath11k_pci_ce_tasklet+0x5f/0xf0 [ath11k_pci] [ 180.719085] ? tasklet_clear_sched+0x42/0xe0 [ 180.719148] tasklet_action_common.constprop.0+0x204/0x2f0 [ 180.719217] __do_softirq+0x276/0x86a [ 180.719281] ? __common_interrupt+0x92/0x1d0 [ 180.719350] __irq_exit_rcu+0x11c/0x180 [ 180.719418] irq_exit_rcu+0x5/0x20 [ 180.719482] common_interrupt+0xa4/0xc0 [ 180.719547] </IRQ> [ 180.719609] <TASK> [ 180.719671] asm_common_interrupt+0x1e/0x40 [ 180.719772] RIP: 0010:cpuidle_enter_state+0x1f3/0x8d0 [ 180.719838] Code: 00 41 8b 77 04 bf ff ff ff ff e8 78 f1 ff ff 31 ff e8 81 fa 52 fe 80 7c 24 08 00 0f 85 9e 01 00 00 e8 11 13 78 fe fb 45 85 e4 <0f> 88 8c 02 00 00 49 63 ec 48 8d 44 6d 00 48 8d 44 85 00 48 8d 7c [ 180.719909] RSP: 0018:ffffffffa4607dd0 EFLAGS: 00000202 [ 180.719982] RAX: 00000000002aea91 RBX: ffffffffa4a5fec0 RCX: 1ffffffff49ca501 [ 180.720047] RDX: 0000000000000000 RSI: ffffffffa3c6e4e0 RDI: ffffffffa3dcf2a0 [ 180.720110] RBP: 0000000000000002 R08: 0000000000000001 R09: ffffffffa4e54d17 [ 180.720173] R10: fffffbfff49ca9a2 R11: 0000000000000001 R12: 0000000000000002 [ 180.720236] R13: ffff8881169ccc04 R14: 0000002a13899598 R15: ffff8881169ccc00 [ 180.720321] cpuidle_enter+0x45/0xa0 [ 180.720413] cpuidle_idle_call+0x274/0x3f0 [ 180.720503] ? arch_cpu_idle_exit+0x30/0x30 [ 180.720869] ? tsc_verify_tsc_adjust+0x97/0x2e0 [ 180.720935] ? lockdep_hardirqs_off+0x90/0xd0 [ 180.721002] do_idle+0xe0/0x150 [ 180.721069] cpu_startup_entry+0x14/0x20 [ 180.721134] start_kernel+0x3a2/0x3c2 [ 180.721200] secondary_startup_64_no_verify+0xb0/0xbb [ 180.721274] </TASK> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-02892.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1 Signed-off-by: Wen Gong <quic_wgong@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/20211217064132.30911-1-quic_wgong@quicinc.com
2021-12-20	ath11k: add signal report to mac80211 for QCA6390 and WCN6855	Wen Gong
	IEEE80211_HW_USES_RSS is set in ath11k, then the device uses RSS and thus requires parallel RX which implies using per-CPU station statistics in sta_get_last_rx_stats() of mac80211. Currently signal is only set in ath11k_mgmt_rx_event(), and not set for RX data packet, then it show signal as 0 for iw command easily. Change to get signal from firmware and report to mac80211. For QCA6390 and WCN6855, the rssi value is already in dbm unit, so don't need to convert it again. Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1 Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1 Signed-off-by: Wen Gong <quic_wgong@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/20211216070535.31732-1-quic_wgong@quicinc.com
2021-12-20	ath11k: report rssi of each chain to mac80211 for QCA6390/WCN6855	Wen Gong
	Command "iw wls1 station dump" does not show each chain's rssi currently. If the rssi of each chain from mon status which parsed in function ath11k_hal_rx_parse_mon_status_tlv() is invalid, then ath11k send wmi cmd WMI_REQUEST_STATS_CMDID with flag WMI_REQUEST_RSSI_PER_CHAIN_STAT to firmware, and parse the rssi of chain in wmi WMI_UPDATE_STATS_EVENTID, then report them to mac80211. WMI_REQUEST_STATS_CMDID is only sent when CONFIG_ATH11K_DEBUGFS is set, it is only called by ath11k_mac_op_sta_statistics(). It does not effect performance and power consumption. Because after STATION connected to AP, it is only called every 6 seconds by NetworkManager in below stack. [ 797.005587] CPU: 0 PID: 701 Comm: NetworkManager Tainted: G W OE 5.13.0-rc6-wt-ath+ #2 [ 797.005596] Hardware name: LENOVO 418065C/418065C, BIOS 83ET63WW (1.33 ) 07/29/2011 [ 797.005600] RIP: 0010:ath11k_mac_op_sta_statistics+0x2f/0x1b0 [ath11k] [ 797.005644] Code: 41 56 41 55 4c 8d aa 58 01 00 00 41 54 55 48 89 d5 53 48 8b 82 58 01 00 00 48 89 cb 4c 8b 70 20 49 8b 06 4c 8b a0 90 08 00 00 <0f> 0b 48 8b 82 b8 01 00 00 48 ba 00 00 00 00 01 00 00 00 48 89 81 [ 797.005651] RSP: 0018:ffffb1fc80a4b890 EFLAGS: 00010282 [ 797.005658] RAX: ffff8a5726200000 RBX: ffffb1fc80a4b958 RCX: ffffb1fc80a4b958 [ 797.005664] RDX: ffff8a5726a609f0 RSI: ffff8a581247f598 RDI: ffff8a5702878800 [ 797.005668] RBP: ffff8a5726a609f0 R08: 0000000000000000 R09: 0000000000000000 [ 797.005672] R10: 0000000000000000 R11: 0000000000000007 R12: 02dd68024f75f480 [ 797.005676] R13: ffff8a5726a60b48 R14: ffff8a5702879f40 R15: ffff8a5726a60000 [ 797.005681] FS: 00007f632c52a380(0000) GS:ffff8a583a200000(0000) knlGS:0000000000000000 [ 797.005687] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 797.005692] CR2: 00007fb025d69000 CR3: 00000001124f6005 CR4: 00000000000606f0 [ 797.005698] Call Trace: [ 797.005710] sta_set_sinfo+0xa7/0xb80 [mac80211] [ 797.005820] ieee80211_get_station+0x50/0x70 [mac80211] [ 797.005925] nl80211_get_station+0xd1/0x200 [cfg80211] [ 797.006045] genl_family_rcv_msg_doit.isra.15+0x111/0x140 [ 797.006059] genl_rcv_msg+0xe6/0x1e0 [ 797.006065] ? nl80211_dump_station+0x220/0x220 [cfg80211] [ 797.006223] ? nl80211_send_station.isra.72+0xf50/0xf50 [cfg80211] [ 797.006348] ? genl_family_rcv_msg_doit.isra.15+0x140/0x140 [ 797.006355] netlink_rcv_skb+0xb9/0xf0 [ 797.006363] genl_rcv+0x24/0x40 [ 797.006369] netlink_unicast+0x18e/0x290 [ 797.006375] netlink_sendmsg+0x30f/0x450 [ 797.006382] sock_sendmsg+0x5b/0x60 [ 797.006393] ____sys_sendmsg+0x219/0x240 [ 797.006403] ? copy_msghdr_from_user+0x5c/0x90 [ 797.006413] ? ____sys_recvmsg+0xf5/0x190 [ 797.006422] ___sys_sendmsg+0x88/0xd0 [ 797.006432] ? copy_msghdr_from_user+0x5c/0x90 [ 797.006443] ? ___sys_recvmsg+0x9e/0xd0 [ 797.006454] ? __fget_files+0x58/0x90 [ 797.006461] ? __fget_light+0x2d/0x70 [ 797.006466] ? do_epoll_wait+0xce/0x720 [ 797.006476] ? __sys_sendmsg+0x63/0xa0 [ 797.006485] __sys_sendmsg+0x63/0xa0 [ 797.006497] do_syscall_64+0x3c/0xb0 [ 797.006509] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 797.006519] RIP: 0033:0x7f632d99912d [ 797.006526] Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 ca ee ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2f 44 89 c7 48 89 44 24 08 e8 fe ee ff ff 48 [ 797.006533] RSP: 002b:00007ffd80808c00 EFLAGS: 00000293 ORIG_RAX: 000000000000002e [ 797.006540] RAX: ffffffffffffffda RBX: 0000563dab99d840 RCX: 00007f632d99912d [ 797.006545] RDX: 0000000000000000 RSI: 00007ffd80808c50 RDI: 000000000000000b [ 797.006549] RBP: 00007ffd80808c50 R08: 0000000000000000 R09: 0000000000001000 [ 797.006552] R10: 0000563dab96f010 R11: 0000000000000293 R12: 0000563dab99d840 [ 797.006556] R13: 0000563dabbb28c0 R14: 00007f632dad4280 R15: 0000563dabab11c0 [ 797.006563] ---[ end trace c9dcf08920c9945c ]--- Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01230-QCAHSTSWPLZ_V2_TO_X86-1 Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-02892.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1 Signed-off-by: Wen Gong <quic_wgong@quicinc.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/20211215090944.19729-1-quic_wgong@quicinc.com
2021-12-20	ath5k: switch to rate table based lookup	Jonas Jelonek
	Switching from legacy usage of ieee80211_get_tx_rates() lookup to direct rate table lookup in struct ieee80211_sta->rates. The current rate control API allows drivers to directly get rates from ieee80211_sta->rates. ath5k is currently one of the legacy drivers that perform translation/merge with the internal rate table via ieee80211_get_tx_rates provided by rate control API. For our upcoming changes to rate control API and the implementation of transmit power control, this patch changes the behaviour. The call to ieee80211_get_tx_rates and subsequent calls are also avoided. ath5k now directly reads rates from sta->rates into its internal rate table. Cause ath5k does not rely on the rate array in SKB->CB, this is not considered anymore except for the first entry (used for probing). Tested this on a PCEngines ALIX with CMP9-GP miniPCI wifi card (Atheros AR5213A). Generated traffic between AP and multiple STAs before and after applying the patch and simultaneously measured throughput and captured rc_stats. Comparison resulted in same rate selection and no performance loss between both runs. Co-developed-by: Thomas Huehn <thomas.huehn@hs-nordhausen.de> Signed-off-by: Thomas Huehn <thomas.huehn@hs-nordhausen.de> Signed-off-by: Jonas Jelonek <jelonek.jonas@gmail.com> Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com> Link: https://lore.kernel.org/r/20211215215042.637-1-jelonek.jonas@gmail.com
2021-12-20	Merge branch 'xsa' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip	Linus Torvalds
	Merge xen fixes from Juergen Gross: "Fixes for two issues related to Xen and malicious guests: - Guest can force the netback driver to hog large amounts of memory - Denial of Service in other guests due to event storms" * 'xsa' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/netback: don't queue unlimited number of packages xen/netback: fix rx queue stall detection xen/console: harden hvc_xen against event channel storms xen/netfront: harden netfront against event channel storms xen/blkfront: harden blkfront against event channel storms
2021-12-20	parisc: Clear stale IIR value on instruction access rights trap	Helge Deller
	When a trap 7 (Instruction access rights) occurs, this means the CPU couldn't execute an instruction due to missing execute permissions on the memory region. In this case it seems the CPU didn't even fetched the instruction from memory and thus did not store it in the cr19 (IIR) register before calling the trap handler. So, the trap handler will find some random old stale value in cr19. This patch simply overwrites the stale IIR value with a constant magic "bad food" value (0xbaadf00d), in the hope people don't start to try to understand the various random IIR values in trap 7 dumps. Noticed-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2021-12-20	KVM: selftests: Add test to verify TRIPLE_FAULT on invalid L2 guest state	Sean Christopherson
	Add a selftest to attempt to enter L2 with invalid guests state by exiting to userspace via I/O from L2, and then using KVM_SET_SREGS to set invalid guest state (marking TR unusable is arbitrary chosen for its relative simplicity). This is a regression test for a bug introduced by commit c8607e4a086f ("KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry"), which incorrectly set vmx->fail=true when L2 had invalid guest state and ultimately triggered a WARN due to nested_vmx_vmexit() seeing vmx->fail==true while attempting to synthesize a nested VM-Exit. The is also a functional test to verify that KVM sythesizes TRIPLE_FAULT for L2, which is somewhat arbitrary behavior, instead of emulating L2. KVM should never emulate L2 due to invalid guest state, as it's architecturally impossible for L1 to run an L2 guest with invalid state as nested VM-Enter should always fail, i.e. L1 needs to do the emulation. Stuffing state via KVM ioctl() is a non-architctural, out-of-band case, hence the TRIPLE_FAULT being rather arbitrary. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211207193006.120997-5-seanjc@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-20	KVM: VMX: Fix stale docs for kvm-intel.emulate_invalid_guest_state	Sean Christopherson
	Update the documentation for kvm-intel's emulate_invalid_guest_state to rectify the description of KVM's default behavior, and to document that the behavior and thus parameter only applies to L1. Fixes: a27685c33acc ("KVM: VMX: Emulate invalid guest state by default") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211207193006.120997-4-seanjc@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-20	KVM: nVMX: Synthesize TRIPLE_FAULT for L2 if emulation is required	Sean Christopherson
	Synthesize a triple fault if L2 guest state is invalid at the time of VM-Enter, which can happen if L1 modifies SMRAM or if userspace stuffs guest state via ioctls(), e.g. KVM_SET_SREGS. KVM should never emulate invalid guest state, since from L1's perspective, it's architecturally impossible for L2 to have invalid state while L2 is running in hardware. E.g. attempts to set CR0 or CR4 to unsupported values will either VM-Exit or #GP. Modifying vCPU state via RSM+SMRAM and ioctl() are the only paths that can trigger this scenario, as nested VM-Enter correctly rejects any attempt to enter L2 with invalid state. RSM is a straightforward case as (a) KVM follows AMD's SMRAM layout and behavior, and (b) Intel's SDM states that loading reserved CR0/CR4 bits via RSM results in shutdown, i.e. there is precedent for KVM's behavior. Following AMD's SMRAM layout is important as AMD's layout saves/restores the descriptor cache information, including CS.RPL and SS.RPL, and also defines all the fields relevant to invalid guest state as read-only, i.e. so long as the vCPU had valid state before the SMI, which is guaranteed for L2, RSM will generate valid state unless SMRAM was modified. Intel's layout saves/restores only the selector, which means that scenarios where the selector and cached RPL don't match, e.g. conforming code segments, would yield invalid guest state. Intel CPUs fudge around this issued by stuffing SS.RPL and CS.RPL on RSM. Per Intel's SDM on the "Default Treatment of RSM", paraphrasing for brevity: IF internal storage indicates that the [CPU was post-VMXON] THEN enter VMX operation (root or non-root); restore VMX-critical state as defined in Section 34.14.1; set to their fixed values any bits in CR0 and CR4 whose values must be fixed in VMX operation [unless coming from an unrestricted guest]; IF RFLAGS.VM = 0 AND (in VMX root operation OR the “unrestricted guest” VM-execution control is 0) THEN CS.RPL := SS.DPL; SS.RPL := SS.DPL; FI; restore current VMCS pointer; FI; Note that Intel CPUs also overwrite the fixed CR0/CR4 bits, whereas KVM will sythesize TRIPLE_FAULT in this scenario. KVM's behavior is allowed as both Intel and AMD define CR0/CR4 SMRAM fields as read-only, i.e. the only way for CR0 and/or CR4 to have illegal values is if they were modified by the L1 SMM handler, and Intel's SDM "SMRAM State Save Map" section states "modifying these registers will result in unpredictable behavior". KVM's ioctl() behavior is less straightforward. Because KVM allows ioctls() to be executed in any order, rejecting an ioctl() if it would result in invalid L2 guest state is not an option as KVM cannot know if a future ioctl() would resolve the invalid state, e.g. KVM_SET_SREGS, or drop the vCPU out of L2, e.g. KVM_SET_NESTED_STATE. Ideally, KVM would reject KVM_RUN if L2 contained invalid guest state, but that carries the risk of a false positive, e.g. if RSM loaded invalid guest state and KVM exited to userspace. Setting a flag/request to detect such a scenario is undesirable because (a) it's extremely unlikely to add value to KVM as a whole, and (b) KVM would need to consider ioctl() interactions with such a flag, e.g. if userspace migrated the vCPU while the flag were set. Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211207193006.120997-3-seanjc@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-20	KVM: VMX: Always clear vmx->fail on emulation_required	Sean Christopherson
	Revert a relatively recent change that set vmx->fail if the vCPU is in L2 and emulation_required is true, as that behavior is completely bogus. Setting vmx->fail and synthesizing a VM-Exit is contradictory and wrong: (a) it's impossible to have both a VM-Fail and VM-Exit (b) vmcs.EXIT_REASON is not modified on VM-Fail (c) emulation_required refers to guest state and guest state checks are always VM-Exits, not VM-Fails. For KVM specifically, emulation_required is handled before nested exits in __vmx_handle_exit(), thus setting vmx->fail has no immediate effect, i.e. KVM calls into handle_invalid_guest_state() and vmx->fail is ignored. Setting vmx->fail can ultimately result in a WARN in nested_vmx_vmexit() firing when tearing down the VM as KVM never expects vmx->fail to be set when L2 is active, KVM always reflects those errors into L1. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 21158 at arch/x86/kvm/vmx/nested.c:4548 nested_vmx_vmexit+0x16bd/0x17e0 arch/x86/kvm/vmx/nested.c:4547 Modules linked in: CPU: 0 PID: 21158 Comm: syz-executor.1 Not tainted 5.16.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:nested_vmx_vmexit+0x16bd/0x17e0 arch/x86/kvm/vmx/nested.c:4547 Code: <0f> 0b e9 2e f8 ff ff e8 57 b3 5d 00 0f 0b e9 00 f1 ff ff 89 e9 80 Call Trace: vmx_leave_nested arch/x86/kvm/vmx/nested.c:6220 [inline] nested_vmx_free_vcpu+0x83/0xc0 arch/x86/kvm/vmx/nested.c:330 vmx_free_vcpu+0x11f/0x2a0 arch/x86/kvm/vmx/vmx.c:6799 kvm_arch_vcpu_destroy+0x6b/0x240 arch/x86/kvm/x86.c:10989 kvm_vcpu_destroy+0x29/0x90 arch/x86/kvm/../../../virt/kvm/kvm_main.c:441 kvm_free_vcpus arch/x86/kvm/x86.c:11426 [inline] kvm_arch_destroy_vm+0x3ef/0x6b0 arch/x86/kvm/x86.c:11545 kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:1189 [inline] kvm_put_kvm+0x751/0xe40 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1220 kvm_vcpu_release+0x53/0x60 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3489 __fput+0x3fc/0x870 fs/file_table.c:280 task_work_run+0x146/0x1c0 kernel/task_work.c:164 exit_task_work include/linux/task_work.h:32 [inline] do_exit+0x705/0x24f0 kernel/exit.c:832 do_group_exit+0x168/0x2d0 kernel/exit.c:929 get_signal+0x1740/0x2120 kernel/signal.c:2852 arch_do_signal_or_restart+0x9c/0x730 arch/x86/kernel/signal.c:868 handle_signal_work kernel/entry/common.c:148 [inline] exit_to_user_mode_loop kernel/entry/common.c:172 [inline] exit_to_user_mode_prepare+0x191/0x220 kernel/entry/common.c:207 __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline] syscall_exit_to_user_mode+0x2e/0x70 kernel/entry/common.c:300 do_syscall_64+0x53/0xd0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: c8607e4a086f ("KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry") Reported-by: syzbot+f1d2136db9c80d4733e8@syzkaller.appspotmail.com Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211207193006.120997-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-20	selftests: KVM: Fix non-x86 compiling	Andrew Jones
	Attempting to compile on a non-x86 architecture fails with include/kvm_util.h: In function â€˜vm_compute_max_gfnâ€™: include/kvm_util.h:79:21: error: dereferencing pointer to incomplete type â€˜struct kvm_vmâ€™ return ((1ULL << vm->pa_bits) >> vm->page_shift) - 1; ^~ This is because the declaration of struct kvm_vm is in lib/kvm_util_internal.h as an effort to make it private to the test lib code. We can still provide arch specific functions, though, by making the generic function symbols weak. Do that to fix the compile error. Fixes: c8cc43c1eae2 ("selftests: KVM: avoid failures due to reserved HyperTransport region") Cc: stable@vger.kernel.org Signed-off-by: Andrew Jones <drjones@redhat.com> Message-Id: <20211214151842.848314-1-drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-20	KVM: x86: Always set kvm_run->if_flag	Marc Orr
	The kvm_run struct's if_flag is a part of the userspace/kernel API. The SEV-ES patches failed to set this flag because it's no longer needed by QEMU (according to the comment in the source code). However, other hypervisors may make use of this flag. Therefore, set the flag for guests with encrypted registers (i.e., with guest_state_protected set). Fixes: f1c6366e3043 ("KVM: SVM: Add required changes to support intercepts under SEV-ES") Signed-off-by: Marc Orr <marcorr@google.com> Message-Id: <20211209155257.128747-1-marcorr@google.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
2021-12-20	KVM: x86/mmu: Don't advance iterator after restart due to yielding	Sean Christopherson
	After dropping mmu_lock in the TDP MMU, restart the iterator during tdp_iter_next() and do not advance the iterator. Advancing the iterator results in skipping the top-level SPTE and all its children, which is fatal if any of the skipped SPTEs were not visited before yielding. When zapping all SPTEs, i.e. when min_level == root_level, restarting the iter and then invoking tdp_iter_next() is always fatal if the current gfn has as a valid SPTE, as advancing the iterator results in try_step_side() skipping the current gfn, which wasn't visited before yielding. Sprinkle WARNs on iter->yielded being true in various helpers that are often used in conjunction with yielding, and tag the helper with __must_check to reduce the probabily of improper usage. Failing to zap a top-level SPTE manifests in one of two ways. If a valid SPTE is skipped by both kvm_tdp_mmu_zap_all() and kvm_tdp_mmu_put_root(), the shadow page will be leaked and KVM will WARN accordingly. WARNING: CPU: 1 PID: 3509 at arch/x86/kvm/mmu/tdp_mmu.c:46 [kvm] RIP: 0010:kvm_mmu_uninit_tdp_mmu+0x3e/0x50 [kvm] Call Trace: <TASK> kvm_arch_destroy_vm+0x130/0x1b0 [kvm] kvm_destroy_vm+0x162/0x2a0 [kvm] kvm_vcpu_release+0x34/0x60 [kvm] __fput+0x82/0x240 task_work_run+0x5c/0x90 do_exit+0x364/0xa10 ? futex_unqueue+0x38/0x60 do_group_exit+0x33/0xa0 get_signal+0x155/0x850 arch_do_signal_or_restart+0xed/0x750 exit_to_user_mode_prepare+0xc5/0x120 syscall_exit_to_user_mode+0x1d/0x40 do_syscall_64+0x48/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae If kvm_tdp_mmu_zap_all() skips a gfn/SPTE but that SPTE is then zapped by kvm_tdp_mmu_put_root(), KVM triggers a use-after-free in the form of marking a struct page as dirty/accessed after it has been put back on the free list. This directly triggers a WARN due to encountering a page with page_count() == 0, but it can also lead to data corruption and additional errors in the kernel. WARNING: CPU: 7 PID: 1995658 at arch/x86/kvm/../../../virt/kvm/kvm_main.c:171 RIP: 0010:kvm_is_zone_device_pfn.part.0+0x9e/0xd0 [kvm] Call Trace: <TASK> kvm_set_pfn_dirty+0x120/0x1d0 [kvm] __handle_changed_spte+0x92e/0xca0 [kvm] __handle_changed_spte+0x63c/0xca0 [kvm] __handle_changed_spte+0x63c/0xca0 [kvm] __handle_changed_spte+0x63c/0xca0 [kvm] zap_gfn_range+0x549/0x620 [kvm] kvm_tdp_mmu_put_root+0x1b6/0x270 [kvm] mmu_free_root_page+0x219/0x2c0 [kvm] kvm_mmu_free_roots+0x1b4/0x4e0 [kvm] kvm_mmu_unload+0x1c/0xa0 [kvm] kvm_arch_destroy_vm+0x1f2/0x5c0 [kvm] kvm_put_kvm+0x3b1/0x8b0 [kvm] kvm_vcpu_release+0x4e/0x70 [kvm] __fput+0x1f7/0x8c0 task_work_run+0xf8/0x1a0 do_exit+0x97b/0x2230 do_group_exit+0xda/0x2a0 get_signal+0x3be/0x1e50 arch_do_signal_or_restart+0x244/0x17f0 exit_to_user_mode_prepare+0xcb/0x120 syscall_exit_to_user_mode+0x1d/0x40 do_syscall_64+0x4d/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae Note, the underlying bug existed even before commit 1af4a96025b3 ("KVM: x86/mmu: Yield in TDU MMU iter even if no SPTES changed") moved calls to tdp_mmu_iter_cond_resched() to the beginning of loops, as KVM could still incorrectly advance past a top-level entry when yielding on a lower-level entry. But with respect to leaking shadow pages, the bug was introduced by yielding before processing the current gfn. Alternatively, tdp_mmu_iter_cond_resched() could simply fall through, or callers could jump to their "retry" label. The downside of that approach is that tdp_mmu_iter_cond_resched() _must_ be called before anything else in the loop, and there's no easy way to enfornce that requirement. Ideally, KVM would handling the cond_resched() fully within the iterator macro (the code is actually quite clean) and avoid this entire class of bugs, but that is extremely difficult do while also supporting yielding after tdp_mmu_set_spte_atomic() fails. Yielding after failing to set a SPTE is very desirable as the "owner" of the REMOVED_SPTE isn't strictly bounded, e.g. if it's zapping a high-level shadow page, the REMOVED_SPTE may block operations on the SPTE for a significant amount of time. Fixes: faaf05b00aec ("kvm: x86/mmu: Support zapping SPTEs in the TDP MMU") Fixes: 1af4a96025b3 ("KVM: x86/mmu: Yield in TDU MMU iter even if no SPTES changed") Reported-by: Ignat Korchagin <ignat@cloudflare.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211214033528.123268-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-20	drm/i915/guc: Only assign guc_id.id when stealing guc_id	Matthew Brost
	Previously assigned whole guc_id structure (list, spin lock) which is incorrect, only assign the guc_id.id. Fixes: 0f7976506de61 ("drm/i915/guc: Rework and simplify locking") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211214170500.28569-3-matthew.brost@intel.com (cherry picked from commit 939d8e9c87e704fd5437e2c8b80929591fe540eb) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2021-12-20	drm/i915/guc: Use correct context lock when callig clr_context_registered	Matthew Brost
	s/ce/cn/ when grabbing guc_state.lock before calling clr_context_registered. Fixes: 0f7976506de61 ("drm/i915/guc: Rework and simplify locking") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211214170500.28569-2-matthew.brost@intel.com (cherry picked from commit b25db8c782ad7ae80d4cea2a09c222f4f8980bb9) Signed-off-by: Jani Nikula <jani.nikula@intel.com>