summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-11-03Merge tag 'wireless-2022-11-03' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Kalle Valo says: ==================== wireless fixes for v6.1 Second set of fixes for v6.1. Some fixes to char type usage in drivers, memory leaks in the stack and also functionality fixes. The rt2x00 char type fix is a larger (but still simple) commit, otherwise the fixes are small in size. * tag 'wireless-2022-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: ath11k: avoid deadlock during regulatory update in ath11k_regd_update() wifi: ath11k: Fix QCN9074 firmware boot on x86 wifi: mac80211: Set TWT Information Frame Disabled bit as 1 wifi: mac80211: Fix ack frame idr leak when mesh has no route wifi: mac80211: fix general-protection-fault in ieee80211_subif_start_xmit() wifi: brcmfmac: Fix potential buffer overflow in brcmf_fweh_event_worker() wifi: airo: do not assign -1 to unsigned char wifi: mac80211_hwsim: fix debugfs attribute ps with rc table support wifi: cfg80211: Fix bitrates overflow issue wifi: cfg80211: fix memory leak in query_regdb_file() wifi: mac80211: fix memory free error when registering wiphy fail wifi: cfg80211: silence a sparse RCU warning wifi: rt2x00: use explicitly signed or unsigned types ==================== Link: https://lore.kernel.org/r/20221103125315.04E57C433C1@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03net: gso: fix panic on frag_list with mixed head alloc typesJiri Benc
Since commit 3dcbdb134f32 ("net: gso: Fix skb_segment splat when splitting gso_size mangled skb having linear-headed frag_list"), it is allowed to change gso_size of a GRO packet. However, that commit assumes that "checking the first list_skb member suffices; i.e if either of the list_skb members have non head_frag head, then the first one has too". It turns out this assumption does not hold. We've seen BUG_ON being hit in skb_segment when skbs on the frag_list had differing head_frag with the vmxnet3 driver. This happens because __netdev_alloc_skb and __napi_alloc_skb can return a skb that is page backed or kmalloced depending on the requested size. As the result, the last small skb in the GRO packet can be kmalloced. There are three different locations where this can be fixed: (1) We could check head_frag in GRO and not allow GROing skbs with different head_frag. However, that would lead to performance regression on normal forward paths with unmodified gso_size, where !head_frag in the last packet is not a problem. (2) Set a flag in bpf_skb_net_grow and bpf_skb_net_shrink indicating that NETIF_F_SG is undesirable. That would need to eat a bit in sk_buff. Furthermore, that flag can be unset when all skbs on the frag_list are page backed. To retain good performance, bpf_skb_net_grow/shrink would have to walk the frag_list. (3) Walk the frag_list in skb_segment when determining whether NETIF_F_SG should be cleared. This of course slows things down. This patch implements (3). To limit the performance impact in skb_segment, the list is walked only for skbs with SKB_GSO_DODGY set that have gso_size changed. Normal paths thus will not hit it. We could check only the last skb but since we need to walk the whole list anyway, let's stay on the safe side. Fixes: 3dcbdb134f32 ("net: gso: Fix skb_segment splat when splitting gso_size mangled skb having linear-headed frag_list") Signed-off-by: Jiri Benc <jbenc@redhat.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/e04426a6a91baf4d1081e1b478c82b5de25fdf21.1667407944.git.jbenc@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== bpf 2022-11-04 We've added 8 non-merge commits during the last 3 day(s) which contain a total of 10 files changed, 113 insertions(+), 16 deletions(-). The main changes are: 1) Fix memory leak upon allocation failure in BPF verifier's stack state tracking, from Kees Cook. 2) Fix address leakage when BPF progs release reference to an object, from Youlin Li. 3) Fix BPF CI breakage from buggy in.h uapi header dependency, from Andrii Nakryiko. 4) Fix bpftool pin sub-command's argument parsing, from Pu Lehui. 5) Fix BPF sockmap lockdep warning by cancelling psock work outside of socket lock, from Cong Wang. 6) Follow-up for BPF sockmap to fix sk_forward_alloc accounting, from Wang Yufen. bpf-for-netdev * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Add verifier test for release_reference() bpf: Fix wrong reg type conversion in release_reference() bpf, sock_map: Move cancel_work_sync() out of sock lock tools/headers: Pull in stddef.h to uapi to fix BPF selftests build in CI net/ipv4: Fix linux/in.h header dependencies bpftool: Fix NULL pointer dereference when pin {PROG, MAP, LINK} without FILE bpf, sockmap: Fix the sk->sk_forward_alloc warning of sk_stream_kill_queues bpf, verifier: Fix memory leak in array reallocation for stack state ==================== Link: https://lore.kernel.org/r/20221104000445.30761-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-04tracing: kprobe: Fix memory leak in test_gen_kprobe/kretprobe_cmd()Shang XiaoJing
test_gen_kprobe_cmd() only free buf in fail path, hence buf will leak when there is no failure. Move kfree(buf) from fail path to common path to prevent the memleak. The same reason and solution in test_gen_kretprobe_cmd(). unreferenced object 0xffff888143b14000 (size 2048): comm "insmod", pid 52490, jiffies 4301890980 (age 40.553s) hex dump (first 32 bytes): 70 3a 6b 70 72 6f 62 65 73 2f 67 65 6e 5f 6b 70 p:kprobes/gen_kp 72 6f 62 65 5f 74 65 73 74 20 64 6f 5f 73 79 73 robe_test do_sys backtrace: [<000000006d7b836b>] kmalloc_trace+0x27/0xa0 [<0000000009528b5b>] 0xffffffffa059006f [<000000008408b580>] do_one_initcall+0x87/0x2a0 [<00000000c4980a7e>] do_init_module+0xdf/0x320 [<00000000d775aad0>] load_module+0x3006/0x3390 [<00000000e9a74b80>] __do_sys_finit_module+0x113/0x1b0 [<000000003726480d>] do_syscall_64+0x35/0x80 [<000000003441e93b>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 Link: https://lore.kernel.org/all/20221102072954.26555-1-shangxiaojing@huawei.com/ Fixes: 64836248dda2 ("tracing: Add kprobe event command generation test module") Cc: stable@vger.kernel.org Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2022-11-04tracing/fprobe: Fix to check whether fprobe is registered correctlyMasami Hiramatsu (Google)
Since commit ab51e15d535e ("fprobe: Introduce FPROBE_FL_KPROBE_SHARED flag for fprobe") introduced fprobe_kprobe_handler() for fprobe::ops::func, unregister_fprobe() fails to unregister the registered if user specifies FPROBE_FL_KPROBE_SHARED flag. Moreover, __register_ftrace_function() is possible to change the ftrace_ops::func, thus we have to check fprobe::ops::saved_func instead. To check it correctly, it should confirm the fprobe::ops::saved_func is either fprobe_handler() or fprobe_kprobe_handler(). Link: https://lore.kernel.org/all/166677683946.1459107.15997653945538644683.stgit@devnote3/ Fixes: cad9931f64dc ("fprobe: Add ftrace based probe APIs") Cc: stable@vger.kernel.org Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2022-11-04fprobe: Check rethook_alloc() return in rethook initializationRafael Mendonca
Check if fp->rethook succeeded to be allocated. Otherwise, if rethook_alloc() fails, then we end up dereferencing a NULL pointer in rethook_add_node(). Link: https://lore.kernel.org/all/20221025031209.954836-1-rafaelmendsr@gmail.com/ Fixes: 5b0ab78998e3 ("fprobe: Add exit_handler support") Cc: stable@vger.kernel.org Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2022-11-04kprobe: reverse kp->flags when arm_kprobe failedLi Qiang
In aggregate kprobe case, when arm_kprobe failed, we need set the kp->flags with KPROBE_FLAG_DISABLED again. If not, the 'kp' kprobe will been considered as enabled but it actually not enabled. Link: https://lore.kernel.org/all/20220902155820.34755-1-liq3ea@163.com/ Fixes: 12310e343755 ("kprobes: Propagate error from arm_kprobe_ftrace()") Cc: stable@vger.kernel.org Signed-off-by: Li Qiang <liq3ea@163.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2022-11-03Merge tag 'ata-6.1-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata Pull ata fixes from Damien Le Moal: "Two driver fixes: - Fix the PIO mode configuration of the pdc20230 (pata_legacy) driver. This also removes a compilation warning with clang and W=1 (Sergey) - Fix devm_platform_ioremap_resource() return value check in the palmld driver (Yang)" * tag 'ata-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: ata: palmld: fix return value check in palmld_pata_probe() ata: pata_legacy: fix pdc20230_set_piomode()
2022-11-04Merge tag 'drm-intel-fixes-2022-11-03' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes - Add locking around DKL PHY register accesses (Imre Deak) - Stop abusing swiotlb_max_segment (Robert Beckett) - Filter out invalid outputs more sensibly (Ville Syrjälä) - Setup DDC fully before output init (Ville Syrjälä) - Simplify intel_panel_add_edid_alt_fixed_modes() (Ville Syrjälä) - Grab mode_config.mutex during LVDS init to avoid WARNs (Ville Syrjälä) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Y2ODlCGM4nACmzsJ@tursulin-desk
2022-11-04selftests/bpf: Add verifier test for release_reference()Youlin Li
Add a test case to ensure that released pointer registers will not be leaked into the map. Before fix: ./test_verifier 984 984/u reference tracking: try to leak released ptr reg FAIL Unexpected success to load! verification time 67 usec stack depth 4 processed 23 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1 984/p reference tracking: try to leak released ptr reg OK Summary: 1 PASSED, 0 SKIPPED, 1 FAILED After fix: ./test_verifier 984 984/u reference tracking: try to leak released ptr reg OK 984/p reference tracking: try to leak released ptr reg OK Summary: 2 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Youlin Li <liulin063@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221103093440.3161-2-liulin063@gmail.com
2022-11-04bpf: Fix wrong reg type conversion in release_reference()Youlin Li
Some helper functions will allocate memory. To avoid memory leaks, the verifier requires the eBPF program to release these memories by calling the corresponding helper functions. When a resource is released, all pointer registers corresponding to the resource should be invalidated. The verifier use release_references() to do this job, by apply __mark_reg_unknown() to each relevant register. It will give these registers the type of SCALAR_VALUE. A register that will contain a pointer value at runtime, but of type SCALAR_VALUE, which may allow the unprivileged user to get a kernel pointer by storing this register into a map. Using __mark_reg_not_init() while NOT allow_ptr_leaks can mitigate this problem. Fixes: fd978bf7fd31 ("bpf: Add reference tracking to verifier") Signed-off-by: Youlin Li <liulin063@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221103093440.3161-1-liulin063@gmail.com
2022-11-04Merge tag 'amd-drm-fixes-6.1-2022-11-02' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.1-2022-11-02: amdgpu: - DCN 3.1.4 fixes - DCN 3.2.x fixes - GC 11.x fixes - Virtual display fix - Fail suspend if resources can't be evicted - SR-IOV fix - Display PSR fix amdkfd: - Fix possible NULL pointer deref - GC 11.x trap handler fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221103023257.10446-1-alexander.deucher@amd.com
2022-11-03Merge tag 'fuse-fixes-6.1-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: "Fix two rarely triggered but long-standing issues" * tag 'fuse-fixes-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: add file_modified() to fallocate fuse: fix readdir cache race
2022-11-03Input: goodix - try resetting the controller when no config is setHans de Goede
On ACPI systems (irq_pin_access_method == IRQ_PIN_ACCESS_ACPI_*) the driver does not reset the controller at probe time, because sometimes the system firmware loads a config and resetting might loose this config. On the Nanote UMPC-01 device OTOH the config is in flash of the controller, the controller needs a reset to load this; and the system firmware does not reset the controller on a cold boot. To fix the Nanote UMPC-01 touchscreen not working on a cold boot, try resetting the controller and then re-reading the config when encountering a config with 0 width/height/max_touch_num value and the controller has not already been reset by goodix_ts_probe(). This should be safe to do in general because normally we should never encounter a config with 0 width/height/max_touch_num. Doing this in general not only avoids the need for a DMI quirk, but also might help other systems. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Bastien Nocera <hadess@hadess.net> Link: https://lore.kernel.org/r/20221025122930.421377-2-hdegoede@redhat.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2022-11-03cxl/pmem: Use size_add() against integer overflowYu Zhe
"struct_size() + n" may cause a integer overflow, use size_add() to handle it. Signed-off-by: Yu Zhe <yuzhe@nfschina.com> Link: https://lore.kernel.org/r/20220927070247.23148-1-yuzhe@nfschina.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-03Merge tag 'for-6.1-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "A batch of error handling fixes for resource leaks, fixes for nowait mode in combination with direct and buffered IO: - direct IO + dsync + nowait could miss a sync of the file after write, add handling for this combination - buffered IO + nowait should not fail with ENOSPC, only blocking IO could determine that - error handling fixes: - fix inode reserve space leak due to nowait buffered write - check the correct variable after allocation (direct IO submit) - fix inode list leak during backref walking - fix ulist freeing in self tests" * tag 'for-6.1-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix inode reserve space leak due to nowait buffered write btrfs: fix nowait buffered write returning -ENOSPC btrfs: remove pointless and double ulist frees in error paths of qgroup tests btrfs: fix ulist leaks in error paths of qgroup self tests btrfs: fix inode list leak during backref walking at find_parent_nodes() btrfs: fix inode list leak during backref walking at resolve_indirect_refs() btrfs: fix lost file sync on direct IO write with nowait and dsync iocb btrfs: fix a memory allocation failure test in btrfs_submit_direct
2022-11-03arm64: cpufeature: Fix the visibility of compat hwcapsAmit Daniel Kachhap
Commit 237405ebef58 ("arm64: cpufeature: Force HWCAP to be based on the sysreg visible to user-space") forced the hwcaps to use sanitised user-space view of the id registers. However, the ID register structures used to select few compat cpufeatures (vfp, crc32, ...) are masked and hence such hwcaps do not appear in /proc/cpuinfo anymore for PER_LINUX32 personality. Add the ID register structures explicitly and set the relevant entry as visible. As these ID registers are now of type visible so make them available in 64-bit userspace by making necessary changes in register emulation logic and documentation. While at it, update the comment for structure ftr_generic_32bits[] which lists the ID register that use it. Fixes: 237405ebef58 ("arm64: cpufeature: Force HWCAP to be based on the sysreg visible to user-space") Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: James Morse <james.morse@arm.com> Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Link: https://lore.kernel.org/r/20221103082232.19189-1-amit.kachhap@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-11-03Merge tag 'linux-kselftest-fixes-6.1-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fixes from Shuah Khan: "Fixes to the pidfd test" * tag 'linux-kselftest-fixes-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/pidfd_test: Remove the erroneous ',' selftests: pidfd: Fix compling warnings ksefltests: pidfd: Fix wait_states: Test terminated by timeout
2022-11-03Merge tag 'net-6.1-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bluetooth and netfilter. Current release - regressions: - net: several zerocopy flags fixes - netfilter: fix possible memory leak in nf_nat_init() - openvswitch: add missing .resv_start_op Previous releases - regressions: - neigh: fix null-ptr-deref in neigh_table_clear() - sched: fix use after free in red_enqueue() - dsa: fall back to default tagger if we can't load the one from DT - bluetooth: fix use-after-free in l2cap_conn_del() Previous releases - always broken: - netfilter: netlink notifier might race to release objects - nfc: fix potential memory leak of skb - bluetooth: fix use-after-free caused by l2cap_reassemble_sdu - bluetooth: use skb_put to set length - eth: tun: fix bugs for oversize packet when napi frags enabled - eth: lan966x: fixes for when MTU is changed - eth: dwmac-loongson: fix invalid mdio_node" * tag 'net-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (53 commits) vsock: fix possible infinite sleep in vsock_connectible_wait_data() vsock: remove the unused 'wait' in vsock_connectible_recvmsg() ipv6: fix WARNING in ip6_route_net_exit_late() bridge: Fix flushing of dynamic FDB entries net, neigh: Fix null-ptr-deref in neigh_table_clear() net/smc: Fix possible leaked pernet namespace in smc_init() stmmac: dwmac-loongson: fix invalid mdio_node ibmvnic: Free rwi on reset success net: mdio: fix undefined behavior in bit shift for __mdiobus_register Bluetooth: L2CAP: Fix attempting to access uninitialized memory Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm Bluetooth: L2CAP: Fix accepting connection request for invalid SPSM Bluetooth: hci_conn: Fix not restoring ISO buffer count on disconnect Bluetooth: L2CAP: Fix memory leak in vhci_write Bluetooth: L2CAP: fix use-after-free in l2cap_conn_del() Bluetooth: virtio_bt: Use skb_put to set length Bluetooth: hci_conn: Fix CIS connection dst_type handling Bluetooth: L2CAP: Fix use-after-free caused by l2cap_reassemble_sdu netfilter: ipset: enforce documented limit to prevent allocating huge memory isdn: mISDN: netjet: fix wrong check of device registration ...
2022-11-03Merge tag 'powerpc-6.1-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix an endian thinko in the asm-generic compat_arg_u64() which led to syscall arguments being swapped for some compat syscalls. - Fix syscall wrapper handling of syscalls with 64-bit arguments on 32-bit kernels, which led to syscall arguments being misplaced. - A build fix for amdgpu on Book3E with AltiVec disabled. Thanks to Andreas Schwab, Christian Zigotzky, and Arnd Bergmann. * tag 'powerpc-6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/32: Select ARCH_SPLIT_ARG64 powerpc/32: fix syscall wrappers with 64-bit arguments asm-generic: compat: fix compat_arg_u64() and compat_arg_u64_dual() powerpc/64e: Fix amdgpu build on Book3E w/o AltiVec
2022-11-03mm/slab_common: repair kernel-doc for __ksize()Lukas Bulwahn
Commit 445d41d7a7c1 ("Merge branch 'slab/for-6.1/kmalloc_size_roundup' into slab/for-next") resolved a conflict of two concurrent changes to __ksize(). However, it did not adjust the kernel-doc comment of __ksize(), while the name of the argument to __ksize() was renamed. Hence, ./scripts/ kernel-doc -none mm/slab_common.c warns about it. Adjust the kernel-doc comment for __ksize() for make W=1 happiness. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2022-11-03arm64: efi: Recover from synchronous exceptions occurring in firmwareArd Biesheuvel
Unlike x86, which has machinery to deal with page faults that occur during the execution of EFI runtime services, arm64 has nothing like that, and a synchronous exception raised by firmware code brings down the whole system. With more EFI based systems appearing that were not built to run Linux (such as the Windows-on-ARM laptops based on Qualcomm SOCs), as well as the introduction of PRM (platform specific firmware routines that are callable just like EFI runtime services), we are more likely to run into issues of this sort, and it is much more likely that we can identify and work around such issues if they don't bring down the system entirely. Since we already use a EFI runtime services call wrapper in assembler, we can quite easily add some code that captures the execution state at the point where the call is made, allowing us to revert to this state and proceed execution if the call triggered a synchronous exception. Given that the kernel and the firmware don't share any data structures that could end up in an indeterminate state, we can happily continue running, as long as we mark the EFI runtime services as unavailable from that point on. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
2022-11-03PCI: hv: Fix the definition of vector in hv_compose_msi_msg()Dexuan Cui
The local variable 'vector' must be u32 rather than u8: see the struct hv_msi_desc3. 'vector_count' should be u16 rather than u8: see struct hv_msi_desc, hv_msi_desc2 and hv_msi_desc3. Fixes: a2bad844a67b ("PCI: hv: Fix interrupt mapping for multi-MSI") Signed-off-by: Dexuan Cui <decui@microsoft.com> Cc: Jeffrey Hugo <quic_jhugo@quicinc.com> Cc: Carl Vanderlip <quic_carlv@quicinc.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Link: https://lore.kernel.org/r/20221027205256.17678-1-decui@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-11-03MAINTAINERS: remove sthemminStephen Hemminger
Leaving Microsoft, the Hyper-V drivers have lots of other maintainers. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Link: https://lore.kernel.org/r/20221028153741.25470-1-stephen@networkplumber.org Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-11-03x86/hyperv: fix invalid writes to MSRs during root partition kexecAnirudh Rayabharam
hyperv_cleanup resets the hypercall page by setting the MSR to 0. However, the root partition is not allowed to write to the GPA bits of the MSR. Instead, it uses the hypercall page provided by the MSR. Similar is the case with the reference TSC MSR. Clear only the enable bit instead of zeroing the entire MSR to make the code valid for root partition too. Signed-off-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20221027095729.1676394-3-anrayabh@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-11-03clocksource/drivers/hyperv: add data structure for reference TSC MSRAnirudh Rayabharam
Add a data structure to represent the reference TSC MSR similar to other MSRs. This simplifies the code for updating the MSR. Signed-off-by: Anirudh Rayabharam <anrayabh@linux.microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20221027095729.1676394-2-anrayabh@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-11-03Merge tag 'for-joerg' of ↵Joerg Roedel
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd into core iommu: Define EINVAL as device/domain incompatibility This series is to replace the previous EMEDIUMTYPE patch in a VFIO series: https://lore.kernel.org/kvm/Yxnt9uQTmbqul5lf@8bytes.org/ The purpose is to regulate all existing ->attach_dev callback functions to use EINVAL exclusively for an incompatibility error between a device and a domain. This allows VFIO and IOMMUFD to detect such a soft error, and then try a different domain with the same device. Among all the patches, the first two are preparatory changes. And then one patch to update kdocs and another three patches for the enforcement effort. Link: https://lore.kernel.org/r/cover.1666042872.git.nicolinc@nvidia.com
2022-11-03iommu: Rename iommu-sva-lib.{c,h}Lu Baolu
Rename iommu-sva-lib.c[h] to iommu-sva.c[h] as it contains all code for SVA implementation in iommu core. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-14-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu: Per-domain I/O page fault handlingLu Baolu
Tweak the I/O page fault handling framework to route the page faults to the domain and call the page fault handler retrieved from the domain. This makes the I/O page fault handling framework possible to serve more usage scenarios as long as they have an IOMMU domain and install a page fault handler in it. Some unused functions are also removed to avoid dead code. The iommu_get_domain_for_dev_pasid() which retrieves attached domain for a {device, PASID} pair is used. It will be used by the page fault handling framework which knows {device, PASID} reported from the iommu driver. We have a guarantee that the SVA domain doesn't go away during IOPF handling, because unbind() won't free the domain until all the pending page requests have been flushed from the pipeline. The drivers either call iopf_queue_flush_dev() explicitly, or in stall case, the device driver is required to flush all DMAs including stalled transactions before calling unbind(). This also renames iopf_handle_group() to iopf_handler() to avoid confusing. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-13-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu: Prepare IOMMU domain for IOPFLu Baolu
This adds some mechanisms around the iommu_domain so that the I/O page fault handling framework could route a page fault to the domain and call the fault handler from it. Add pointers to the page fault handler and its private data in struct iommu_domain. The fault handler will be called with the private data as a parameter once a page fault is routed to the domain. Any kernel component which owns an iommu domain could install handler and its private parameter so that the page fault could be further routed and handled. This also prepares the SVA implementation to be the first consumer of the per-domain page fault handling model. The I/O page fault handler for SVA is copied to the SVA file with mmget_not_zero() added before mmap_read_lock(). Suggested-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-12-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu: Remove SVA related callbacks from iommu opsLu Baolu
These ops'es have been deprecated. There's no need for them anymore. Remove them to avoid dead code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-11-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu/sva: Refactoring iommu_sva_bind/unbind_device()Lu Baolu
The existing iommu SVA interfaces are implemented by calling the SVA specific iommu ops provided by the IOMMU drivers. There's no need for any SVA specific ops in iommu_ops vector anymore as we can achieve this through the generic attach/detach_dev_pasid domain ops. This refactors the IOMMU SVA interfaces implementation by using the iommu_attach/detach_device_pasid interfaces and align them with the concept of the SVA iommu domain. Put the new SVA code in the SVA related file in order to make it self-contained. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03arm-smmu-v3/sva: Add SVA domain supportLu Baolu
Add support for SVA domain allocation and provide an SVA-specific iommu_domain_ops. This implementation is based on the existing SVA code. Possible cleanup and refactoring are left for incremental changes later. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Link: https://lore.kernel.org/r/20221031005917.45690-9-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu/vt-d: Add SVA domain supportLu Baolu
Add support for SVA domain allocation and provide an SVA-specific iommu_domain_ops. This implementation is based on the existing SVA code. Possible cleanup and refactoring are left for incremental changes later. The VT-d driver will also need to support setting a DMA domain to a PASID of device. Current SVA implementation uses different data structures to track the domain and device PASID relationship. That's the reason why we need to check the domain type in remove_dev_pasid callback. Eventually we'll consolidate the data structures and remove the need of domain type check. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu: Add IOMMU SVA domain supportLu Baolu
The SVA iommu_domain represents a hardware pagetable that the IOMMU hardware could use for SVA translation. This adds some infrastructures to support SVA domain in the iommu core. It includes: - Extend the iommu_domain to support a new IOMMU_DOMAIN_SVA domain type. The IOMMU drivers that support allocation of the SVA domain should provide its own SVA domain specific iommu_domain_ops. - Add a helper to allocate an SVA domain. The iommu_domain_free() is still used to free an SVA domain. The report_iommu_fault() should be replaced by the new iommu_report_device_fault(). Leave the existing fault handler with the existing users and the newly added SVA members excludes it. Suggested-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu: Add attach/detach_dev_pasid iommu interfacesLu Baolu
Attaching an IOMMU domain to a PASID of a device is a generic operation for modern IOMMU drivers which support PASID-granular DMA address translation. Currently visible usage scenarios include (but not limited): - SVA (Shared Virtual Address) - kernel DMA with PASID - hardware-assist mediated device This adds the set_dev_pasid domain ops for setting the domain onto a PASID of a device and remove_dev_pasid iommu ops for removing any setup on a PASID of device. This also adds interfaces for device drivers to attach/detach/retrieve a domain for a PASID of a device. If multiple devices share a single group, it's fine as long the fabric always routes every TLP marked with a PASID to the host bridge and only the host bridge. For example, ACS achieves this universally and has been checked when pci_enable_pasid() is called. As we can't reliably tell the source apart in a group, all the devices in a group have to be considered as the same source, and mapped to the same PASID table. The DMA ownership is about the whole device (more precisely, iommu group), including the RID and PASIDs. When the ownership is converted, the pasid array must be empty. This also adds necessary checks in the DMA ownership interfaces. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03PCI: Enable PASID only when ACS RR & UF enabled on upstream pathLu Baolu
The Requester ID/Process Address Space ID (PASID) combination identifies an address space distinct from the PCI bus address space, e.g., an address space defined by an IOMMU. But the PCIe fabric routes Memory Requests based on the TLP address, ignoring any PASID (PCIe r6.0, sec 2.2.10.4), so a TLP with PASID that SHOULD go upstream to the IOMMU may instead be routed as a P2P Request if its address falls in a bridge window. To ensure that all Memory Requests with PASID are routed upstream, only enable PASID if ACS P2P Request Redirect and Upstream Forwarding are enabled for the path leading to the device. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Suggested-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu: Remove SVM_FLAG_SUPERVISOR_MODE supportLu Baolu
The current kernel DMA with PASID support is based on the SVA with a flag SVM_FLAG_SUPERVISOR_MODE. The IOMMU driver binds the kernel memory address space to a PASID of the device. The device driver programs the device with kernel virtual address (KVA) for DMA access. There have been security and functional issues with this approach: - The lack of IOTLB synchronization upon kernel page table updates. (vmalloc, module/BPF loading, CONFIG_DEBUG_PAGEALLOC etc.) - Other than slight more protection, using kernel virtual address (KVA) has little advantage over physical address. There are also no use cases yet where DMA engines need kernel virtual addresses for in-kernel DMA. This removes SVM_FLAG_SUPERVISOR_MODE support from the IOMMU interface. The device drivers are suggested to handle kernel DMA with PASID through the kernel DMA APIs. The drvdata parameter in iommu_sva_bind_device() and all callbacks is not needed anymore. Cleanup them as well. Link: https://lore.kernel.org/linux-iommu/20210511194726.GP1002214@nvidia.com/ Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Fenghua Yu <fenghua.yu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu: Add max_pasids field in struct dev_iommuLu Baolu
Use this field to save the number of PASIDs that a device is able to consume. It is a generic attribute of a device and lifting it into the per-device dev_iommu struct could help to avoid the boilerplate code in various IOMMU drivers. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03iommu: Add max_pasids field in struct iommu_deviceLu Baolu
Use this field to keep the number of supported PASIDs that an IOMMU hardware is able to support. This is a generic attribute of an IOMMU and lifting it into the per-IOMMU device structure makes it possible to allocate a PASID for device without calls into the IOMMU drivers. Any iommu driver that supports PASID related features should set this field before enabling them on the devices. In the Intel IOMMU driver, intel_iommu_sm is moved to CONFIG_INTEL_IOMMU enclave so that the pasid_supported() helper could be used in dmar.c without compilation errors. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Tony Zhu <tony.zhu@intel.com> Link: https://lore.kernel.org/r/20221031005917.45690-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-11-03KVM: x86: Fix a typo about the usage of kvcalloc()Liao Chang
Swap the 1st and 2nd arguments to be consistent with the usage of kvcalloc(). Fixes: c9b8fecddb5b ("KVM: use kvcalloc for array allocations") Signed-off-by: Liao Chang <liaochang1@huawei.com> Message-Id: <20221103011749.139262-1-liaochang1@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-03KVM: x86: Use SRCU to protect zap in __kvm_set_or_clear_apicv_inhibit()Ben Gardon
kvm_zap_gfn_range() must be called in an SRCU read-critical section, but there is no SRCU annotation in __kvm_set_or_clear_apicv_inhibit(). This can lead to the following warning via kvm_arch_vcpu_ioctl_set_guest_debug() if a Shadow MMU is in use (TDP MMU disabled or nesting): [ 1416.659809] ============================= [ 1416.659810] WARNING: suspicious RCU usage [ 1416.659839] 6.1.0-dbg-DEV #1 Tainted: G S I [ 1416.659853] ----------------------------- [ 1416.659854] include/linux/kvm_host.h:954 suspicious rcu_dereference_check() usage! [ 1416.659856] ... [ 1416.659904] dump_stack_lvl+0x84/0xaa [ 1416.659910] dump_stack+0x10/0x15 [ 1416.659913] lockdep_rcu_suspicious+0x11e/0x130 [ 1416.659919] kvm_zap_gfn_range+0x226/0x5e0 [ 1416.659926] ? kvm_make_all_cpus_request_except+0x18b/0x1e0 [ 1416.659935] __kvm_set_or_clear_apicv_inhibit+0xcc/0x100 [ 1416.659940] kvm_arch_vcpu_ioctl_set_guest_debug+0x350/0x390 [ 1416.659946] kvm_vcpu_ioctl+0x2fc/0x620 [ 1416.659955] __se_sys_ioctl+0x77/0xc0 [ 1416.659962] __x64_sys_ioctl+0x1d/0x20 [ 1416.659965] do_syscall_64+0x3d/0x80 [ 1416.659969] entry_SYSCALL_64_after_hwframe+0x63/0xcd Always take the KVM SRCU read lock in __kvm_set_or_clear_apicv_inhibit() to protect the GFN to memslot translation. The SRCU read lock is not technically required when no Shadow MMUs are in use, since the TDP MMU walks the paging structures from the roots and does not need to look up GFN translations in the memslots, but make the SRCU locking unconditional for simplicty. In most cases, the SRCU locking is taken care of in the vCPU run loop, but when called through other ioctls (such as KVM_SET_GUEST_DEBUG) there is no srcu_read_lock. Tested: ran tools/testing/selftests/kvm/x86_64/debug_regs on a DBG build. This patch causes the suspicious RCU warning to disappear. Note that the warning is hit in __kvm_zap_rmaps(), so kvm_memslots_have_rmaps() must return true in order for this to repro (i.e. the TDP MMU must be off or nesting in use.) Reported-by: Greg Thelen <gthelen@google.com> Fixes: 36222b117e36 ("KVM: x86: don't disable APICv memslot when inhibited") Signed-off-by: Ben Gardon <bgardon@google.com> Message-Id: <20221102205359.1260980-1-bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-03spi: stm32: fix stm32_spi_prepare_mbr() that halves spi clk for every runSean Nyekjaer
When this driver is used with a driver that uses preallocated spi_transfer structs. The speed_hz is halved by every run. This results in: spi_stm32 44004000.spi: SPI transfer setup failed ads7846 spi0.0: SPI transfer failed: -22 Example when running with DIV_ROUND_UP(): - First run; speed_hz = 1000000, spi->clk_rate 125000000 div 125 -> mbrdiv = 7, cur_speed = 976562 - Second run; speed_hz = 976562 div 128,00007 (roundup to 129) -> mbrdiv = 8, cur_speed = 488281 - Third run; speed_hz = 488281 div 256,000131072067109 (roundup to 257) and then -EINVAL is returned. Use DIV_ROUND_CLOSEST to allow to round down and allow us to keep the set speed. Signed-off-by: Sean Nyekjaer <sean@geanix.com> Link: https://lore.kernel.org/r/20221103080043.3033414-1-sean@geanix.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-11-03bpf, sock_map: Move cancel_work_sync() out of sock lockCong Wang
Stanislav reported a lockdep warning, which is caused by the cancel_work_sync() called inside sock_map_close(), as analyzed below by Jakub: psock->work.func = sk_psock_backlog() ACQUIRE psock->work_mutex sk_psock_handle_skb() skb_send_sock() __skb_send_sock() sendpage_unlocked() kernel_sendpage() sock->ops->sendpage = inet_sendpage() sk->sk_prot->sendpage = tcp_sendpage() ACQUIRE sk->sk_lock tcp_sendpage_locked() RELEASE sk->sk_lock RELEASE psock->work_mutex sock_map_close() ACQUIRE sk->sk_lock sk_psock_stop() sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED) cancel_work_sync() __cancel_work_timer() __flush_work() // wait for psock->work to finish RELEASE sk->sk_lock We can move the cancel_work_sync() out of the sock lock protection, but still before saved_close() was called. Fixes: 799aa7f98d53 ("skmsg: Avoid lock_sock() in sk_psock_backlog()") Reported-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20221102043417.279409-1-xiyou.wangcong@gmail.com
2022-11-03tools/headers: Pull in stddef.h to uapi to fix BPF selftests build in CIAndrii Nakryiko
With recent sync of linux/in.h tools/include headers are now relying on __DECLARE_FLEX_ARRAY macro, which isn't itself defined inside tools/include headers anywhere and is instead assumed to be present in system-wide UAPI header. This breaks isolated environments that don't have kernel UAPI headers installed system-wide, like BPF CI ([0]). To fix this, bring in include/uapi/linux/stddef.h into tools/include. We can't just copy/paste it, though, it has to be processed with scripts/headers_install.sh, which has a dependency on scripts/unifdef. So the full command to (re-)generate stddef.h for inclusion into tools/include directory is: $ make scripts_unifdef && \ cp $KBUILD_OUTPUT/scripts/unifdef scripts/ && \ scripts/headers_install.sh include/uapi/linux/stddef.h tools/include/uapi/linux/stddef.h This assumes KBUILD_OUTPUT envvar is set and used for out-of-tree builds. [0] https://github.com/kernel-patches/bpf/actions/runs/3379432493/jobs/5610982609 Fixes: 036b8f5b8970 ("tools headers uapi: Update linux/in.h copy") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/bpf/20221102182517.2675301-2-andrii@kernel.org
2022-11-03net/ipv4: Fix linux/in.h header dependenciesAndrii Nakryiko
__DECLARE_FLEX_ARRAY is defined in include/uapi/linux/stddef.h but doesn't seem to be explicitly included from include/uapi/linux/in.h, which breaks BPF selftests builds (once we sync linux/stddef.h into tools/include directory in the next patch). Fix this by explicitly including linux/stddef.h. Given this affects BPF CI and bpf tree, targeting this for bpf tree. Fixes: 5854a09b4957 ("net/ipv4: Use __DECLARE_FLEX_ARRAY() helper") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/bpf/20221102182517.2675301-1-andrii@kernel.org
2022-11-03Merge branch 'vsock-remove-an-unused-variable-and-fix-infinite-sleep'Paolo Abeni
Dexuan Cui says: ==================== vsock: remove an unused variable and fix infinite sleep Patch 1 removes the unused 'wait' variable. Patch 2 fixes an infinite sleep issue reported by a hv_sock user. ==================== Link: https://lore.kernel.org/r/20221101021706.26152-1-decui@microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-03vsock: fix possible infinite sleep in vsock_connectible_wait_data()Dexuan Cui
Currently vsock_connectible_has_data() may miss a wakeup operation between vsock_connectible_has_data() == 0 and the prepare_to_wait(). Fix the race by adding the process to the wait queue before checking vsock_connectible_has_data(). Fixes: b3f7fd54881b ("af_vsock: separate wait data loop") Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reported-by: Frédéric Dalleau <frederic.dalleau@docker.com> Tested-by: Frédéric Dalleau <frederic.dalleau@docker.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-03vsock: remove the unused 'wait' in vsock_connectible_recvmsg()Dexuan Cui
Remove the unused variable introduced by 19c1b90e1979. Fixes: 19c1b90e1979 ("af_vsock: separate receive data loop") Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-03x86/xen: simplify sysenter and syscall setupJuergen Gross
xen_enable_sysenter() and xen_enable_syscall() can be simplified a lot. While at it, switch to use cpu_feature_enabled() instead of boot_cpu_has(). Signed-off-by: Juergen Gross <jgross@suse.com>