summaryrefslogtreecommitdiff
path: root/tools/include/uapi/linux
AgeCommit message (Collapse)Author
2024-12-27Merge tag 'hardening-v6.13-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fix from Kees Cook: - stddef: make __struct_group() UAPI C++-friendly (Alexander Lobakin) * tag 'hardening-v6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: stddef: make __struct_group() UAPI C++-friendly
2024-12-20stddef: make __struct_group() UAPI C++-friendlyAlexander Lobakin
For the most part of the C++ history, it couldn't have type declarations inside anonymous unions for different reasons. At the same time, __struct_group() relies on the latters, so when the @TAG argument is not empty, C++ code doesn't want to build (even under `extern "C"`): ../linux/include/uapi/linux/pkt_cls.h:25:24: error: 'struct tc_u32_sel::<unnamed union>::tc_u32_sel_hdr,' invalid; an anonymous union may only have public non-static data members [-fpermissive] The safest way to fix this without trying to switch standards (which is impossible in UAPI anyway) etc., is to disable tag declaration for that language. This won't break anything since for now it's not buildable at all. Use a separate definition for __struct_group() when __cplusplus is defined to mitigate the error, including the version from tools/. Fixes: 50d7bd38c3aa ("stddef: Introduce struct_group() helper macro") Reported-by: Christopher Ferris <cferris@google.com> Closes: https://lore.kernel.org/linux-hardening/Z1HZpe3WE5As8UAz@google.com Suggested-by: Kees Cook <kees@kernel.org> # __struct_group_tag() Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20241219135734.2130002-1-aleksander.lobakin@intel.com Signed-off-by: Kees Cook <kees@kernel.org>
2024-12-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfAlexei Starovoitov
Cross-merge bpf fixes after downstream PR. No conflicts. Adjacent changes in: Auto-merging include/linux/bpf.h Auto-merging include/linux/bpf_verifier.h Auto-merging kernel/bpf/btf.c Auto-merging kernel/bpf/verifier.c Auto-merging kernel/trace/bpf_trace.c Auto-merging tools/testing/selftests/bpf/progs/test_tp_btf_nullable.c Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-13bpf: Add fd_array_cnt attribute for prog_loadAnton Protopopov
The fd_array attribute of the BPF_PROG_LOAD syscall may contain a set of file descriptors: maps or btfs. This field was introduced as a sparse array. Introduce a new attribute, fd_array_cnt, which, if present, indicates that the fd_array is a continuous array of the corresponding length. If fd_array_cnt is non-zero, then every map in the fd_array will be bound to the program, as if it was used by the program. This functionality is similar to the BPF_PROG_BIND_MAP syscall, but such maps can be used by the verifier during the program load. Signed-off-by: Anton Protopopov <aspsk@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241213130934.1087929-5-aspsk@isovalent.com
2024-12-04tools headers: Sync uapi/linux/kvm.h with the kernel sourcesNamhyung Kim
To pick up the changes in this cset: e785dfacf7e7fe94 ("LoongArch: KVM: Add PCHPIC device support") 2e8b9df82631e714 ("LoongArch: KVM: Add EIOINTC device support") c532de5a67a70f85 ("LoongArch: KVM: Add IPI device support") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Tianrui Zhao <zhaotianrui@loongson.cn> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: kvm@vger.kernel.org Cc: loongarch@lists.linux.dev Link: https://lore.kernel.org/r/20241203035349.1901262-4-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04tools headers: Sync uapi/linux/perf_event.h with the kernel sourcesNamhyung Kim
To pick up the changes in this cset: 18d92bb57c39504d ("perf/core: Add aux_pause, aux_resume, aux_start_paused") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20241203035349.1901262-3-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-21Merge tag 'net-next-6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "The most significant set of changes is the per netns RTNL. The new behavior is disabled by default, regression risk should be contained. Notably the new config knob PTP_1588_CLOCK_VMCLOCK will inherit its default value from PTP_1588_CLOCK_KVM, as the first is intended to be a more reliable replacement for the latter. Core: - Started a very large, in-progress, effort to make the RTNL lock scope per network-namespace, thus reducing the lock contention significantly in the containerized use-case, comprising: - RCU-ified some relevant slices of the FIB control path - introduce basic per netns locking helpers - namespacified the IPv4 address hash table - remove rtnl_register{,_module}() in favour of rtnl_register_many() - refactor rtnl_{new,del,set}link() moving as much validation as possible out of RTNL lock - convert all phonet doit() and dumpit() handlers to RCU - convert IPv4 addresses manipulation to per-netns RTNL - convert virtual interface creation to per-netns RTNL the per-netns lock infrastructure is guarded by the CONFIG_DEBUG_NET_SMALL_RTNL knob, disabled by default ad interim. - Introduce NAPI suspension, to efficiently switching between busy polling (NAPI processing suspended) and normal processing. - Migrate the IPv4 routing input, output and control path from direct ToS usage to DSCP macros. This is a work in progress to make ECN handling consistent and reliable. - Add drop reasons support to the IPv4 rotue input path, allowing better introspection in case of packets drop. - Make FIB seqnum lockless, dropping RTNL protection for read access. - Make inet{,v6} addresses hashing less predicable. - Allow providing timestamp OPT_ID via cmsg, to correlate TX packets and timestamps Things we sprinkled into general kernel code: - Add small file operations for debugfs, to reduce the struct ops size. - Refactoring and optimization for the implementation of page_frag API, This is a preparatory work to consolidate the page_frag implementation. Netfilter: - Optimize set element transactions to reduce memory consumption - Extended netlink error reporting for attribute parser failure. - Make legacy xtables configs user selectable, giving users the option to configure iptables without enabling any other config. - Address a lot of false-positive RCU issues, pointed by recent CI improvements. BPF: - Put xsk sockets on a struct diet and add various cleanups. Overall, this helps to bump performance by 12% for some workloads. - Extend BPF selftests to increase coverage of XDP features in combination with BPF cpumap. - Optimize and homogenize bpf_csum_diff helper for all archs and also add a batch of new BPF selftests for it. - Extend netkit with an option to delegate skb->{mark,priority} scrubbing to its BPF program. - Make the bpf_get_netns_cookie() helper available also to tc(x) BPF programs. Protocols: - Introduces 4-tuple hash for connected udp sockets, speeding-up significantly connected sockets lookup. - Add a fastpath for some TCP timers that usually expires after close, the socket lock contention. - Add inbound and outbound xfrm state caches to speed up state lookups. - Avoid sending MPTCP advertisements on stale subflows, reducing risks on loosing them. - Make neighbours table flushing more scalable, maintaining per device neigh lists. Driver API: - Introduce a unified interface to configure transmission H/W shaping, and expose it to user-space via generic-netlink. - Add support for per-NAPI config via netlink. This makes napi configuration persistent across queues removal and re-creation. Requires driver updates, currently supported drivers are: nVidia/Mellanox mlx4 and mlx5, Broadcom brcm and Intel ice. - Add ethtool support for writing SFP / PHY firmware blocks. - Track RSS context allocation from ethtool core. - Implement support for mirroring to DSA CPU port, via TC mirror offload. - Consolidate FDB updates notification, to avoid duplicates on device-specific entries. - Expose DPLL clock quality level to the user-space. - Support master-slave PHY config via device tree. Tests and tooling: - forwarding: introduce deferred commands, to simplify the cleanup phase Drivers: - Updated several drivers - Amazon vNic, Google vNic, Microsoft vNic, Intel e1000e and Broadcom Tigon3 - to use netdev-genl to link the IRQs and queues to NAPI IDs, allowing busy polling and better introspection. - Ethernet high-speed NICs: - nVidia/Mellanox: - mlx5: - a large refactor to implement support for cross E-Switch scheduling - refactor H/W conter management to let it scale better - H/W GRO cleanups - Intel (100G, ice):: - add support for ethtool reset - implement support for per TX queue H/W shaping - AMD/Solarflare: - implement per device queue stats support - Broadcom (bnxt): - improve wildcard l4proto on IPv4/IPv6 ntuple rules - Marvell Octeon: - Add representor support for each Resource Virtualization Unit (RVU) device. - Hisilicon: - add support for the BMC Gigabit Ethernet - IBM (EMAC): - driver cleanup and modernization - Cisco (VIC): - raise the queues number limit to 256 - Ethernet virtual: - Google vNIC: - implement page pool support - macsec: - inherit lower device's features and TSO limits when offloading - virtio_net: - enable premapped mode by default - support for XDP socket(AF_XDP) zerocopy TX - wireguard: - set the TSO max size to be GSO_MAX_SIZE, to aggregate larger packets. - Ethernet NICs embedded and virtual: - Broadcom ASP: - enable software timestamping - Freescale: - add enetc4 PF driver - MediaTek: Airoha SoC: - implement BQL support - RealTek r8169: - enable TSO by default on r8168/r8125 - implement extended ethtool stats - Renesas AVB: - enable TX checksum offload - Synopsys (stmmac): - support header splitting for vlan tagged packets - move common code for DWMAC4 and DWXGMAC into a separate FPE module. - add dwmac driver support for T-HEAD TH1520 SoC - Synopsys (xpcs): - driver refactor and cleanup - TI: - icssg_prueth: add VLAN offload support - Xilinx emaclite: - add clock support - Ethernet switches: - Microchip: - implement support for the lan969x Ethernet switch family - add LAN9646 switch support to KSZ DSA driver - Ethernet PHYs: - Marvel: 88q2x: enable auto negotiation - Microchip: add support for LAN865X Rev B1 and LAN867X Rev C1/C2 - PTP: - Add support for the Amazon virtual clock device - Add PtP driver for s390 clocks - WiFi: - mac80211 - EHT 1024 aggregation size for transmissions - new operation to indicate that a new interface is to be added - support radio separation of multi-band devices - move wireless extension spy implementation to libiw - Broadcom: - brcmfmac: optional LPO clock support - Microchip: - add support for Atmel WILC3000 - Qualcomm (ath12k): - firmware coredump collection support - add debugfs support for a multitude of statistics - Qualcomm (ath5k): - Arcadyan ARV45XX AR2417 & Gigaset SX76[23] AR241[34]A support - Realtek: - rtw88: 8821au and 8812au USB adapters support - rtw89: add thermal protection - rtw89: fine tune BT-coexsitence to improve user experience - rtw89: firmware secure boot for WiFi 6 chip - Bluetooth - add Qualcomm WCN785x support for ids Foxconn 0xe0fc/0xe0f3 and 0x13d3:0x3623 - add Realtek RTL8852BE support for id Foxconn 0xe123 - add MediaTek MT7920 support for wireless module ids - btintel_pcie: add handshake between driver and firmware - btintel_pcie: add recovery mechanism - btnxpuart: add GPIO support to power save feature" * tag 'net-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1475 commits) mm: page_frag: fix a compile error when kernel is not compiled Documentation: tipc: fix formatting issue in tipc.rst selftests: nic_performance: Add selftest for performance of NIC driver selftests: nic_link_layer: Add selftest case for speed and duplex states selftests: nic_link_layer: Add link layer selftest for NIC driver bnxt_en: Add FW trace coredump segments to the coredump bnxt_en: Add a new ethtool -W dump flag bnxt_en: Add 2 parameters to bnxt_fill_coredump_seg_hdr() bnxt_en: Add functions to copy host context memory bnxt_en: Do not free FW log context memory bnxt_en: Manage the FW trace context memory bnxt_en: Allocate backing store memory for FW trace logs bnxt_en: Add a 'force' parameter to bnxt_free_ctx_mem() bnxt_en: Refactor bnxt_free_ctx_mem() bnxt_en: Add mem_valid bit to struct bnxt_ctx_mem_type bnxt_en: Update firmware interface spec to 1.10.3.85 selftests/bpf: Add some tests with sockmap SK_PASS bpf: fix recursive lock when verdict program return SK_PASS wireguard: device: support big tcp GSO wireguard: selftests: load nf_conntrack if not present ...
2024-11-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfAlexei Starovoitov
Cross-merge bpf fixes after downstream PR. In particular to bring the fix in commit aa30eb3260b2 ("bpf: Force checkpoint when jmp history is too long"). The follow up verifier work depends on it. And the fix in commit 6801cf7890f2 ("selftests/bpf: Use -4095 as the bad address for bits iterator"). It's fixing instability of BPF CI on s390 arch. No conflicts. Adjacent changes in: Auto-merging arch/Kconfig Auto-merging kernel/bpf/helpers.c Auto-merging kernel/bpf/memalloc.c Auto-merging kernel/bpf/verifier.c Auto-merging mm/slab_common.c Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-11net: Add napi_struct parameter irq_suspend_timeoutMartin Karsten
Add a per-NAPI IRQ suspension parameter, which can be get/set with netdev-genl. This patch doesn't change any behavior but prepares the code for other changes in the following commits which use irq_suspend_timeout as a timeout for IRQ suspension. Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca> Co-developed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Tested-by: Martin Karsten <mkarsten@uwaterloo.ca> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Link: https://patch.msgid.link/20241109050245.191288-2-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11bpf: Add support for uprobe multi session attachJiri Olsa
Adding support to attach BPF program for entry and return probe of the same function. This is common use case which at the moment requires to create two uprobe multi links. Adding new BPF_TRACE_UPROBE_SESSION attach type that instructs kernel to attach single link program to both entry and exit probe. It's possible to control execution of the BPF program on return probe simply by returning zero or non zero from the entry BPF program execution to execute or not the BPF program on return probe respectively. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-4-jolsa@kernel.org
2024-10-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.12-rc6). Conflicts: drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c cbe84e9ad5e2 ("wifi: iwlwifi: mvm: really send iwl_txpower_constraints_cmd") 188a1bf89432 ("wifi: mac80211: re-order assigning channel in activate links") https://lore.kernel.org/all/20241028123621.7bbb131b@canb.auug.org.au/ net/mac80211/cfg.c c4382d5ca1af ("wifi: mac80211: update the right link for tx power") 8dd0498983ee ("wifi: mac80211: Fix setting txpower with emulate_chanctx") drivers/net/ethernet/intel/ice/ice_ptp_hw.h 6e58c3310622 ("ice: fix crash on probe for DPLL enabled E810 LOM") e4291b64e118 ("ice: Align E810T GPIO to other products") ebb2693f8fbd ("ice: Read SDP section from NVM for pin definitions") ac532f4f4251 ("ice: Cleanup unused declarations") https://lore.kernel.org/all/20241030120524.1ee1af18@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30Merge tag 'perf-tools-fixes-for-v6.12-2-2024-10-30' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Update more header copies with the kernel sources, including const.h, msr-index.h, arm64's cputype.h, kvm's, bits.h and unaligned.h - The return from 'write' isn't a pid, fix cut'n'paste error in 'perf trace' - Fix up the python binding build on architectures without HAVE_KVM_STAT_SUPPORT - Add some more bounds checks to augmented_raw_syscalls.bpf.c (used to collect syscall pointer arguments in 'perf trace') to make the resulting bytecode to pass the kernel BPF verifier, allowing us to go back accepting clang 12.0.1 as the minimum version required for compiling BPF sources - Add __NR_capget for x86 to fix a regression on running perf + intel PT (hw tracing) as non-root setting up the capabilities as described in https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html - Fix missing syscalltbl in non-explicitly listed architectures, noticed on ARM 32-bit, that still needs a .tbl generator for the syscall id<->name tables, should be added for v6.13 - Handle 'perf test' failure when handling broken DWARF for ASM files * tag 'perf-tools-fixes-for-v6.12-2-2024-10-30' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: perf cap: Add __NR_capget to arch/x86 unistd tools headers: Update the linux/unaligned.h copy with the kernel sources tools headers arm64: Sync arm64's cputype.h with the kernel sources tools headers: Synchronize {uapi/}linux/bits.h with the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources perf python: Fix up the build on architectures without HAVE_KVM_STAT_SUPPORT perf test: Handle perftool-testsuite_probe failure due to broken DWARF tools headers UAPI: Sync kvm headers with the kernel sources perf trace: Fix non-listed archs in the syscalltbl routines perf build: Change the clang check back to 12.0.1 perf trace augmented_raw_syscalls: Add more checks to pass the verifier perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers perf trace: The return from 'write' isn't a pid tools headers UAPI: Sync linux/const.h with the kernel headers
2024-10-28tools headers: Synchronize {uapi/}linux/bits.h with the kernel sourcesArnaldo Carvalho de Melo
To pick up the changes in this cset: 947697c6f0f75f98 ("uapi: Define GENMASK_U128") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/bits.h include/uapi/linux/bits.h diff -u tools/include/linux/bits.h include/linux/bits.h Please see tools/include/uapi/README for further details. Acked-by: Yury Norov <yury.norov@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/Zx-ZVH7bHqtFn8Dv@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-10-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni
Cross-merge networking fixes after downstream PR. No conflicts and no adjacent changes. Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfAlexei Starovoitov
Cross-merge bpf fixes after downstream PR. No conflicts. Adjacent changes in: include/linux/bpf.h include/uapi/linux/bpf.h kernel/bpf/btf.c kernel/bpf/helpers.c kernel/bpf/syscall.c kernel/bpf/verifier.c kernel/trace/bpf_trace.c mm/slab_common.c tools/include/uapi/linux/bpf.h tools/testing/selftests/bpf/Makefile Link: https://lore.kernel.org/all/20241024215724.60017-1-daniel@iogearbox.net/ Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-24bpf: Add the missing BPF_LINK_TYPE invocation for sockmapHou Tao
There is an out-of-bounds read in bpf_link_show_fdinfo() for the sockmap link fd. Fix it by adding the missing BPF_LINK_TYPE invocation for sockmap link Also add comments for bpf_link_type to prevent missing updates in the future. Fixes: 699c23f02c65 ("bpf: Add bpf_link support for sk_msg and sk_skb progs") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241024013558.1135167-2-houtao@huaweicloud.com
2024-10-18Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds
Pull bpf fixes from Daniel Borkmann: - Fix BPF verifier to not affect subreg_def marks in its range propagation (Eduard Zingerman) - Fix a truncation bug in the BPF verifier's handling of coerce_reg_to_size_sx (Dimitar Kanaliev) - Fix the BPF verifier's delta propagation between linked registers under 32-bit addition (Daniel Borkmann) - Fix a NULL pointer dereference in BPF devmap due to missing rxq information (Florian Kauer) - Fix a memory leak in bpf_core_apply (Jiri Olsa) - Fix an UBSAN-reported array-index-out-of-bounds in BTF parsing for arrays of nested structs (Hou Tao) - Fix build ID fetching where memory areas backing the file were created with memfd_secret (Andrii Nakryiko) - Fix BPF task iterator tid filtering which was incorrectly using pid instead of tid (Jordan Rome) - Several fixes for BPF sockmap and BPF sockhash redirection in combination with vsocks (Michal Luczaj) - Fix riscv BPF JIT and make BPF_CMPXCHG fully ordered (Andrea Parri) - Fix riscv BPF JIT under CONFIG_CFI_CLANG to prevent the possibility of an infinite BPF tailcall (Pu Lehui) - Fix a build warning from resolve_btfids that bpf_lsm_key_free cannot be resolved (Thomas Weißschuh) - Fix a bug in kfunc BTF caching for modules where the wrong BTF object was returned (Toke Høiland-Jørgensen) - Fix a BPF selftest compilation error in cgroup-related tests with musl libc (Tony Ambardar) - Several fixes to BPF link info dumps to fill missing fields (Tyrone Wu) - Add BPF selftests for kfuncs from multiple modules, checking that the correct kfuncs are called (Simon Sundberg) - Ensure that internal and user-facing bpf_redirect flags don't overlap (Toke Høiland-Jørgensen) - Switch to use kvzmalloc to allocate BPF verifier environment (Rik van Riel) - Use raw_spinlock_t in BPF ringbuf to fix a sleep in atomic splat under RT (Wander Lairson Costa) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (38 commits) lib/buildid: Handle memfd_secret() files in build_id_parse() selftests/bpf: Add test case for delta propagation bpf: Fix print_reg_state's constant scalar dump bpf: Fix incorrect delta propagation between linked registers bpf: Properly test iter/task tid filtering bpf: Fix iter/task tid filtering riscv, bpf: Make BPF_CMPXCHG fully ordered bpf, vsock: Drop static vsock_bpf_prot initialization vsock: Update msg_count on read_skb() vsock: Update rx_bytes on read_skb() bpf, sockmap: SK_DROP on attempted redirects of unsupported af_vsock selftests/bpf: Add asserts for netfilter link info bpf: Fix link info netfilter flags to populate defrag flag selftests/bpf: Add test for sign extension in coerce_subreg_to_size_sx() selftests/bpf: Add test for truncation after sign extension in coerce_reg_to_size_sx() bpf: Fix truncation bug in coerce_reg_to_size_sx() selftests/bpf: Assert link info uprobe_multi count & path_size if unset bpf: Fix unpopulated path_size when uprobe_multi fields unset selftests/bpf: Fix cross-compiling urandom_read selftests/bpf: Add test for kfunc module order ...
2024-10-17tools headers UAPI: Sync linux/const.h with the kernel headersArnaldo Carvalho de Melo
To pick up the changes in: 947697c6f0f75f98 ("uapi: Define GENMASK_U128") That causes no changes in tooling, just addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/const.h include/uapi/linux/const.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/lkml/ZwltGNJwujKu1Fgn@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-10-15Merge tag 'for-netdev' of ↵Paolo Abeni
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-10-14 The following pull-request contains BPF updates for your *net-next* tree. We've added 21 non-merge commits during the last 18 day(s) which contain a total of 21 files changed, 1185 insertions(+), 127 deletions(-). The main changes are: 1) Put xsk sockets on a struct diet and add various cleanups. Overall, this helps to bump performance by 12% for some workloads, from Maciej Fijalkowski. 2) Extend BPF selftests to increase coverage of XDP features in combination with BPF cpumap, from Alexis Lothoré (eBPF Foundation). 3) Extend netkit with an option to delegate skb->{mark,priority} scrubbing to its BPF program, from Daniel Borkmann. 4) Make the bpf_get_netns_cookie() helper available also to tc(x) BPF programs, from Mahe Tardy. 5) Extend BPF selftests covering a BPF program setting socket options per MPTCP subflow, from Geliang Tang and Nicolas Rybowski. bpf-next-for-netdev * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (21 commits) xsk: Use xsk_buff_pool directly for cq functions xsk: Wrap duplicated code to function xsk: Carry a copy of xdp_zc_max_segs within xsk_buff_pool xsk: Get rid of xdp_buff_xsk::orig_addr xsk: s/free_list_node/list_node/ xsk: Get rid of xdp_buff_xsk::xskb_list_node selftests/bpf: check program redirect in xdp_cpumap_attach selftests/bpf: make xdp_cpumap_attach keep redirect prog attached selftests/bpf: fix bpf_map_redirect call for cpu map test selftests/bpf: add tcx netns cookie tests bpf: add get_netns_cookie helper to tc programs selftests/bpf: add missing header include for htons selftests/bpf: Extend netkit tests to validate skb meta data tools: Sync if_link.h uapi tooling header netkit: Add add netkit scrub support to rt_link.yaml netkit: Simplify netkit mode over to use NLA_POLICY_MAX netkit: Add option for scrubbing skb meta data bpf: Remove unused macro selftests/bpf: Add mptcp subflow subtest selftests/bpf: Add getsockopt to inspect mptcp subflow ... ==================== Link: https://patch.msgid.link/20241014211110.16562-1-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-14netdev-genl: Support setting per-NAPI config valuesJoe Damato
Add support to set per-NAPI defer_hard_irqs and gro_flush_timeout. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241011184527.16393-7-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-14netdev-genl: Dump gro_flush_timeoutJoe Damato
Support dumping gro_flush_timeout for a NAPI ID. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241011184527.16393-5-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-14netdev-genl: Dump napi_defer_hard_irqsJoe Damato
Support dumping defer_hard_irqs for a NAPI ID. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20241011184527.16393-3-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10bpf: Update bpf_override_return() commentMartin Kelly
The documentation says CONFIG_FUNCTION_ERROR_INJECTION is supported only on x86. This was presumably true at the time of writing, but it's now supported on many other architectures too. Drop this statement, since it's not correct anymore and it fits better in other documentation anyway. Signed-off-by: Martin Kelly <martin.kelly@crowdstrike.com> Link: https://lore.kernel.org/r/20241010193301.995909-1-martin.kelly@crowdstrike.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.12-rc3). No conflicts and no adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10bpf: fix argument type in bpf_loop documentationMatteo Croce
The `index` argument to bpf_loop() is threaded as an u64. This lead in a subtle verifier denial where clang cloned the argument in another register[1]. [1] https://github.com/systemd/systemd/pull/34650#issuecomment-2401092895 Signed-off-by: Matteo Croce <teknoraver@meta.com> Link: https://lore.kernel.org/r/20241010035652.17830-1-technoboy85@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-07tools: Sync if_link.h uapi tooling headerDaniel Borkmann
Sync if_link uapi header to the latest version as we need the refresher in tooling for netkit device. Given it's been a while since the last sync and the diff is fairly big, it has been done as its own commit. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20241004101335.117711-4-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-10-04net: add IFLA_MAX_PACING_OFFLOAD_HORIZON device attributeEric Dumazet
Some network devices have the ability to offload EDT (Earliest Departure Time) which is the model used for TCP pacing and FQ packet scheduler. Some of them implement the timing wheel mechanism described in https://saeed.github.io/files/carousel-sigcomm17.pdf with an associated 'timing wheel horizon'. This patch adds dev->max_pacing_offload_horizon expressing this timing wheel horizon in nsec units. This is a read-only attribute. Unless a driver sets it, dev->max_pacing_offload_horizon is zero. v2: addressed Jakub feedback ( https://lore.kernel.org/netdev/20240930152304.472767-2-edumazet@google.com/T/#mf6294d714c41cc459962154cc2580ce3c9693663 ) v3: added yaml doc (also per Jakub feedback) Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20241003121219.2396589-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03bpf: __bpf_fastcall for bpf_get_smp_processor_id in uapiEduard Zingerman
Since [1] kernel supports __bpf_fastcall attribute for helper function bpf_get_smp_processor_id(). Update uapi definition for this helper in order to have this attribute in the generated bpf_helper_defs.h [1] commit 91b7fbf3936f ("bpf, x86, riscv, arm: no_caller_saved_registers for bpf_get_smp_processor_id()") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240916091712.2929279-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-01bpf: Sync uapi bpf.h header to tools directoryDaniel Borkmann
There is a delta between kernel UAPI bpf.h and tools UAPI bpf.h, thus sync them again. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2024-09-30tools headers UAPI: Sync the linux/in.h with the kernel sourcesArnaldo Carvalho de Melo
Picking the changes from: 70d0bb45fae87a3b ("net: Correct spelling in headers") Just a comment fix, addressing this perf build warning: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h Please see tools/include/uapi/README for details. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/lkml/ZvrNlLdtXAZ1sIIj@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-21Merge tag 'bpf-next-6.12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: - Introduce '__attribute__((bpf_fastcall))' for helpers and kfuncs with corresponding support in LLVM. It is similar to existing 'no_caller_saved_registers' attribute in GCC/LLVM with a provision for backward compatibility. It allows compilers generate more efficient BPF code assuming the verifier or JITs will inline or partially inline a helper/kfunc with such attribute. bpf_cast_to_kern_ctx, bpf_rdonly_cast, bpf_get_smp_processor_id are the first set of such helpers. - Harden and extend ELF build ID parsing logic. When called from sleepable context the relevants parts of ELF file will be read to find and fetch .note.gnu.build-id information. Also harden the logic to avoid TOCTOU, overflow, out-of-bounds problems. - Improvements and fixes for sched-ext: - Allow passing BPF iterators as kfunc arguments - Make the pointer returned from iter_next method trusted - Fix x86 JIT convergence issue due to growing/shrinking conditional jumps in variable length encoding - BPF_LSM related: - Introduce few VFS kfuncs and consolidate them in fs/bpf_fs_kfuncs.c - Enforce correct range of return values from certain LSM hooks - Disallow attaching to other LSM hooks - Prerequisite work for upcoming Qdisc in BPF: - Allow kptrs in program provided structs - Support for gen_epilogue in verifier_ops - Important fixes: - Fix uprobe multi pid filter check - Fix bpf_strtol and bpf_strtoul helpers - Track equal scalars history on per-instruction level - Fix tailcall hierarchy on x86 and arm64 - Fix signed division overflow to prevent INT_MIN/-1 trap on x86 - Fix get kernel stack in BPF progs attached to tracepoint:syscall - Selftests: - Add uprobe bench/stress tool - Generate file dependencies to drastically improve re-build time - Match JIT-ed and BPF asm with __xlated/__jited keywords - Convert older tests to test_progs framework - Add support for RISC-V - Few fixes when BPF programs are compiled with GCC-BPF backend (support for GCC-BPF in BPF CI is ongoing in parallel) - Add traffic monitor - Enable cross compile and musl libc * tag 'bpf-next-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (260 commits) btf: require pahole 1.21+ for DEBUG_INFO_BTF with default DWARF version btf: move pahole check in scripts/link-vmlinux.sh to lib/Kconfig.debug btf: remove redundant CONFIG_BPF test in scripts/link-vmlinux.sh bpf: Call the missed kfree() when there is no special field in btf bpf: Call the missed btf_record_free() when map creation fails selftests/bpf: Add a test case to write mtu result into .rodata selftests/bpf: Add a test case to write strtol result into .rodata selftests/bpf: Rename ARG_PTR_TO_LONG test description selftests/bpf: Fix ARG_PTR_TO_LONG {half-,}uninitialized test bpf: Zero former ARG_PTR_TO_{LONG,INT} args in case of error bpf: Improve check_raw_mode_ok test for MEM_UNINIT-tagged types bpf: Fix helper writes to read-only maps bpf: Remove truncation test in bpf_strtol and bpf_strtoul helpers bpf: Fix bpf_strtol and bpf_strtoul helpers for 32bit selftests/bpf: Add tests for sdiv/smod overflow cases bpf: Fix a sdiv overflow issue libbpf: Add bpf_object__token_fd accessor docs/bpf: Add missing BPF program types to docs docs/bpf: Add constant values for linkages bpf: Use fake pt_regs when doing bpf syscall tracepoint tracing ...
2024-09-11netdev: add dmabuf introspectionMina Almasry
Add dmabuf information to page_pool stats: $ ./cli.py --spec ../netlink/specs/netdev.yaml --dump page-pool-get ... {'dmabuf': 10, 'id': 456, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, {'dmabuf': 10, 'id': 455, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, {'dmabuf': 10, 'id': 454, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, {'dmabuf': 10, 'id': 453, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, {'dmabuf': 10, 'id': 452, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, {'dmabuf': 10, 'id': 451, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, {'dmabuf': 10, 'id': 450, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, {'dmabuf': 10, 'id': 449, 'ifindex': 3, 'inflight': 1023, 'inflight-mem': 4190208}, And queue stats: $ ./cli.py --spec ../netlink/specs/netdev.yaml --dump queue-get ... {'dmabuf': 10, 'id': 8, 'ifindex': 3, 'type': 'rx'}, {'dmabuf': 10, 'id': 9, 'ifindex': 3, 'type': 'rx'}, {'dmabuf': 10, 'id': 10, 'ifindex': 3, 'type': 'rx'}, {'dmabuf': 10, 'id': 11, 'ifindex': 3, 'type': 'rx'}, {'dmabuf': 10, 'id': 12, 'ifindex': 3, 'type': 'rx'}, {'dmabuf': 10, 'id': 13, 'ifindex': 3, 'type': 'rx'}, {'dmabuf': 10, 'id': 14, 'ifindex': 3, 'type': 'rx'}, {'dmabuf': 10, 'id': 15, 'ifindex': 3, 'type': 'rx'}, Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20240910171458.219195-14-almasrymina@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11net: netdev netlink api to bind dma-buf to a net deviceMina Almasry
API takes the dma-buf fd as input, and binds it to the netdevice. The user can specify the rx queues to bind the dma-buf to. Suggested-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20240910171458.219195-3-almasrymina@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-26Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-08-23 We've added 10 non-merge commits during the last 15 day(s) which contain a total of 10 files changed, 222 insertions(+), 190 deletions(-). The main changes are: 1) Add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_*sockopt() to address the case when long-lived sockets miss a chance to set additional callbacks if a sockops program was not attached early in their lifetime, from Alan Maguire. 2) Add a batch of BPF selftest improvements which fix a few bugs and add missing features to improve the test coverage of sockmap/sockhash, from Michal Luczaj. 3) Fix a false-positive Smatch-reported off-by-one in tcp_validate_cookie() which is part of the test_tcp_custom_syncookie BPF selftest, from Kuniyuki Iwashima. 4) Fix the flow_dissector BPF selftest which had a bug in IP header's tot_len calculation doing subtraction after htons() instead of inside htons(), from Asbjørn Sloth Tønnesen. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: selftest: bpf: Remove mssind boundary check in test_tcp_custom_syncookie.c. selftests/bpf: Introduce __attribute__((cleanup)) in create_pair() selftests/bpf: Exercise SOCK_STREAM unix_inet_redir_to_connected() selftests/bpf: Honour the sotype of af_unix redir tests selftests/bpf: Simplify inet_socketpair() and vsock_socketpair_connectible() selftests/bpf: Socket pair creation, cleanups selftests/bpf: Support more socket types in create_pair() selftests/bpf: Avoid subtraction after htons() in ipip tests selftests/bpf: add sockopt tests for TCP_BPF_SOCK_OPS_CB_FLAGS bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flags ==================== Link: https://patch.msgid.link/20240823134959.1091-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-23bpf: Add bpf_copy_from_user_str kfuncJordan Rome
This adds a kfunc wrapper around strncpy_from_user, which can be called from sleepable BPF programs. This matches the non-sleepable 'bpf_probe_read_user_str' helper except it includes an additional 'flags' param, which allows consumers to clear the entire destination buffer on success or failure. Signed-off-by: Jordan Rome <linux@jordanrome.com> Link: https://lore.kernel.org/r/20240823195101.3621028-1-linux@jordanrome.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-08-08bpf/bpf_get,set_sockopt: add option to set TCP-BPF sock ops flagsAlan Maguire
Currently the only opportunity to set sock ops flags dictating which callbacks fire for a socket is from within a TCP-BPF sockops program. This is problematic if the connection is already set up as there is no further chance to specify callbacks for that socket. Add TCP_BPF_SOCK_OPS_CB_FLAGS to bpf_setsockopt() and bpf_getsockopt() to allow users to specify callbacks later, either via an iterator over sockets or via a socket-specific program triggered by a setsockopt() on the socket. Previous discussion on this here [1]. [1] https://lore.kernel.org/bpf/f42f157b-6e52-dd4d-3d97-9b86c84c0b00@oracle.com/ Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/r/20240808150558.1035626-2-alan.maguire@oracle.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-08-07tools/include: Sync filesystem headers with the kernel sourcesNamhyung Kim
To pick up changes from: 0f9ca80fa4f9 fs: Add initial atomic write support info to statx f9af549d1fd3 fs: export mount options via statmount() 0a3deb11858a fs: Allow listmount() in foreign mount namespace 09b31295f833 fs: export the mount ns id via statmount d04bccd8c19d listmount: allow listing in reverse order bfc69fd05ef9 fs/procfs: add build ID fetching to PROCMAP_QUERY API ed5d583a88a9 fs/procfs: implement efficient VMA querying API for /proc/<pid>/maps This should be used to beautify FS syscall arguments and it addresses these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/stat.h include/uapi/linux/stat.h diff -u tools/perf/trace/beauty/include/uapi/linux/fs.h include/uapi/linux/fs.h diff -u tools/perf/trace/beauty/include/uapi/linux/mount.h include/uapi/linux/mount.h diff -u tools/perf/trace/beauty/include/uapi/linux/stat.h include/uapi/linux/stat.h Please see tools/include/uapi/README for details (it's in the first patch of this series). Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-08-07tools/include: Sync network socket headers with the kernel sourcesNamhyung Kim
To pick up changes from: d25a92ccae6b net/smc: Introduce IPPROTO_SMC 060f4ba6e403 io_uring/net: move charging socket out of zc io_uring bb6aaf736680 net: Split a __sys_listen helper for io_uring dc2e77979412 net: Split a __sys_bind helper for io_uring This should be used to beautify socket syscall arguments and it addresses these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h Please see tools/include/uapi/README for details (it's in the first patch of this series). Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: netdev@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-08-06tools/include: Sync uapi/linux/perf.h with the kernel sourcesNamhyung Kim
To pick up changes from: 608f6976c309 perf/x86/intel: Support new data source for Lunar Lake This should be used to beautify perf syscall arguments and it addresses these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h Please see tools/include/uapi/README for details (it's in the first patch of this series). Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: "Liang, Kan" <kan.liang@linux.intel.com> Cc: linux-perf-users@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-08-06tools/include: Sync uapi/linux/kvm.h with the kernel sourcesNamhyung Kim
And other arch-specific UAPI headers to pick up changes from: 4b23e0c199b2 KVM: Ensure new code that references immediate_exit gets extra scrutiny 85542adb65ec KVM: x86: Add KVM_RUN_X86_GUEST_MODE kvm_run flag 6fef518594bc KVM: x86: Add a capability to configure bus frequency for APIC timer 34ff65901735 x86/sev: Use kernel provided SVSM Calling Areas 5dcc1e76144f Merge tag 'kvm-x86-misc-6.11' of https://github.com/kvm-x86/linux into HEAD 9a0d2f4995dd KVM: PPC: Book3S HV: Add one-reg interface for HASHPKEYR register e9eb790b2557 KVM: PPC: Book3S HV: Add one-reg interface for HASHKEYR register 1a1e6865f516 KVM: PPC: Book3S HV: Add one-reg interface for DEXCR register This should be used to beautify KVM syscall arguments and it addresses these tools/perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h diff -u tools/arch/x86/include/uapi/asm/svm.h arch/x86/include/uapi/asm/svm.h diff -u tools/arch/powerpc/include/uapi/asm/kvm.h arch/powerpc/include/uapi/asm/kvm.h Please see tools/include/uapi/README for details (it's in the first patch of this series). Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-25Merge tag 'net-6.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf and netfilter. A lot of networking people were at a conference last week, busy catching COVID, so relatively short PR. Current release - regressions: - tcp: process the 3rd ACK with sk_socket for TFO and MPTCP Current release - new code bugs: - l2tp: protect session IDR and tunnel session list with one lock, make sure the state is coherent to avoid a warning - eth: bnxt_en: update xdp_rxq_info in queue restart logic - eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field Previous releases - regressions: - xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len, the field reuses previously un-validated pad Previous releases - always broken: - tap/tun: drop short frames to prevent crashes later in the stack - eth: ice: add a per-VF limit on number of FDIR filters - af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash" * tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits) tun: add missing verification for short frame tap: add missing verification for short frame mISDN: Fix a use after free in hfcmulti_tx() gve: Fix an edge case for TSO skb validity check bnxt_en: update xdp_rxq_info in queue restart logic tcp: process the 3rd ACK with sk_socket for TFO/MPTCP selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len bpf: Fix a segment issue when downgrading gso_size net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling MAINTAINERS: make Breno the netconsole maintainer MAINTAINERS: Update bonding entry net: nexthop: Initialize all fields in dumped nexthops net: stmmac: Correct byte order of perfect_match selftests: forwarding: skip if kernel not support setting bridge fdb learning limit tipc: Return non-zero value from tipc_udp_addr2str() on error netfilter: nft_set_pipapo_avx2: disable softinterrupts ice: Fix recipe read procedure ice: Add a per-VF limit on number of FDIR filters net: bonding: correctly annotate RCU in bond_should_notify_peers() ...
2024-07-25selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata testStanislav Fomichev
This flag is now required to use tx_metadata_len. Fixes: 40808a237d9c ("selftests/bpf: Add TX side to xdp_metadata") Reported-by: Julian Schindel <mail@arctic-alpaca.de> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20240713015253.121248-3-sdf@fomichev.me
2024-07-24Merge tag 'random-6.11-rc1-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "This adds getrandom() support to the vDSO. First, it adds a new kind of mapping to mmap(2), MAP_DROPPABLE, which lets the kernel zero out pages anytime under memory pressure, which enables allocating memory that never gets swapped to disk but also doesn't count as being mlocked. Then, the vDSO implementation of getrandom() is introduced in a generic manner and hooked into random.c. Next, this is implemented on x86. (Also, though it's not ready for this pull, somebody has begun an arm64 implementation already) Finally, two vDSO selftests are added. There are also two housekeeping cleanup commits" * tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: MAINTAINERS: add random.h headers to RNG subsection random: note that RNDGETPOOL was removed in 2.6.9-rc2 selftests/vDSO: add tests for vgetrandom x86: vdso: Wire up getrandom() vDSO implementation random: introduce generic vDSO getrandom() implementation mm: add MAP_DROPPABLE for designating always lazily freeable mappings
2024-07-21Merge tag 'mm-stable-2024-07-21-14-50' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - In the series "mm: Avoid possible overflows in dirty throttling" Jan Kara addresses a couple of issues in the writeback throttling code. These fixes are also targetted at -stable kernels. - Ryusuke Konishi's series "nilfs2: fix potential issues related to reserved inodes" does that. This should actually be in the mm-nonmm-stable tree, along with the many other nilfs2 patches. My bad. - More folio conversions from Kefeng Wang in the series "mm: convert to folio_alloc_mpol()" - Kemeng Shi has sent some cleanups to the writeback code in the series "Add helper functions to remove repeated code and improve readability of cgroup writeback" - Kairui Song has made the swap code a little smaller and a little faster in the series "mm/swap: clean up and optimize swap cache index". - In the series "mm/memory: cleanly support zeropage in vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David Hildenbrand has reworked the rather sketchy handling of the use of the zeropage in MAP_SHARED mappings. I don't see any runtime effects here - more a cleanup/understandability/maintainablity thing. - Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling of higher addresses, for aarch64. The (poorly named) series is "Restructure va_high_addr_switch". - The core TLB handling code gets some cleanups and possible slight optimizations in Bang Li's series "Add update_mmu_tlb_range() to simplify code". - Jane Chu has improved the handling of our fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in the series "Enhance soft hwpoison handling and injection". - Jeff Johnson has sent a billion patches everywhere to add MODULE_DESCRIPTION() to everything. Some landed in this pull. - In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang has simplified migration's use of hardware-offload memory copying. - Yosry Ahmed performs more folio API conversions in his series "mm: zswap: trivial folio conversions". - In the series "large folios swap-in: handle refault cases first", Chuanhua Han inches us forward in the handling of large pages in the swap code. This is a cleanup and optimization, working toward the end objective of full support of large folio swapin/out. - In the series "mm,swap: cleanup VMA based swap readahead window calculation", Huang Ying has contributed some cleanups and a possible fixlet to his VMA based swap readahead code. - In the series "add mTHP support for anonymous shmem" Baolin Wang has taught anonymous shmem mappings to use multisize THP. By default this is a no-op - users must opt in vis sysfs controls. Dramatic improvements in pagefault latency are realized. - David Hildenbrand has some cleanups to our remaining use of page_mapcount() in the series "fs/proc: move page_mapcount() to fs/proc/internal.h". - David also has some highmem accounting cleanups in the series "mm/highmem: don't track highmem pages manually". - Build-time fixes and cleanups from John Hubbard in the series "cleanups, fixes, and progress towards avoiding "make headers"". - Cleanups and consolidation of the core pagemap handling from Barry Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers and utilize them". - Lance Yang's series "Reclaim lazyfree THP without splitting" has reduced the latency of the reclaim of pmd-mapped THPs under fairly common circumstances. A 10x speedup is seen in a microbenchmark. It does this by punting to aother CPU but I guess that's a win unless all CPUs are pegged. - hugetlb_cgroup cleanups from Xiu Jianfeng in the series "mm/hugetlb_cgroup: rework on cftypes". - Miaohe Lin's series "Some cleanups for memory-failure" does just that thing. - Someone other than SeongJae has developed a DAMON feature in Honggyu Kim's series "DAMON based tiered memory management for CXL memory". This adds DAMON features which may be used to help determine the efficiency of our placement of CXL/PCIe attached DRAM. - DAMON user API centralization and simplificatio work in SeongJae Park's series "mm/damon: introduce DAMON parameters online commit function". - In the series "mm: page_type, zsmalloc and page_mapcount_reset()" David Hildenbrand does some maintenance work on zsmalloc - partially modernizing its use of pageframe fields. - Kefeng Wang provides more folio conversions in the series "mm: remove page_maybe_dma_pinned() and page_mkclean()". - More cleanup from David Hildenbrand, this time in the series "mm/memory_hotplug: use PageOffline() instead of PageReserved() for !ZONE_DEVICE". It "enlightens memory hotplug more about PageOffline() pages" and permits the removal of some virtio-mem hacks. - Barry Song's series "mm: clarify folio_add_new_anon_rmap() and __folio_add_anon_rmap()" is a cleanup to the anon folio handling in preparation for mTHP (multisize THP) swapin. - Kefeng Wang's series "mm: improve clear and copy user folio" implements more folio conversions, this time in the area of large folio userspace copying. - The series "Docs/mm/damon/maintaier-profile: document a mailing tool and community meetup series" tells people how to get better involved with other DAMON developers. From SeongJae Park. - A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does that. - David Hildenbrand sends along more cleanups, this time against the migration code. The series is "mm/migrate: move NUMA hinting fault folio isolation + checks under PTL". - Jan Kara has found quite a lot of strangenesses and minor errors in the readahead code. He addresses this in the series "mm: Fix various readahead quirks". - SeongJae Park's series "selftests/damon: test DAMOS tried regions and {min,max}_nr_regions" adds features and addresses errors in DAMON's self testing code. - Gavin Shan has found a userspace-triggerable WARN in the pagecache code. The series "mm/filemap: Limit page cache size to that supported by xarray" addresses this. The series is marked cc:stable. - Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations and cleanup" cleans up and slightly optimizes KSM. - Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of code motion. The series (which also makes the memcg-v1 code Kconfigurable) are "mm: memcg: separate legacy cgroup v1 code and put under config option" and "mm: memcg: put cgroup v1-specific memcg data under CONFIG_MEMCG_V1" - Dan Schatzberg's series "Add swappiness argument to memory.reclaim" adds an additional feature to this cgroup-v2 control file. - The series "Userspace controls soft-offline pages" from Jiaqi Yan permits userspace to stop the kernel's automatic treatment of excessive correctable memory errors. In order to permit userspace to monitor and handle this situation. - Kefeng Wang's series "mm: migrate: support poison recover from migrate folio" teaches the kernel to appropriately handle migration from poisoned source folios rather than simply panicing. - SeongJae Park's series "Docs/damon: minor fixups and improvements" does those things. - In the series "mm/zsmalloc: change back to per-size_class lock" Chengming Zhou improves zsmalloc's scalability and memory utilization. - Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for pinning memfd folios" makes the GUP code use FOLL_PIN rather than bare refcount increments. So these paes can first be moved aside if they reside in the movable zone or a CMA block. - Andrii Nakryiko has added a binary ioctl()-based API to /proc/pid/maps for much faster reading of vma information. The series is "query VMAs from /proc/<pid>/maps". - In the series "mm: introduce per-order mTHP split counters" Lance Yang improves the kernel's presentation of developer information related to multisize THP splitting. - Michael Ellerman has developed the series "Reimplement huge pages without hugepd on powerpc (8xx, e500, book3s/64)". This permits userspace to use all available huge page sizes. - In the series "revert unconditional slab and page allocator fault injection calls" Vlastimil Babka removes a performance-affecting and not very useful feature from slab fault injection. * tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (411 commits) mm/mglru: fix ineffective protection calculation mm/zswap: fix a white space issue mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio mm/hugetlb: fix possible recursive locking detected warning mm/gup: clear the LRU flag of a page before adding to LRU batch mm/numa_balancing: teach mpol_to_str about the balancing mode mm: memcg1: convert charge move flags to unsigned long long alloc_tag: fix page_ext_get/page_ext_put sequence during page splitting lib: reuse page_ext_data() to obtain codetag_ref lib: add missing newline character in the warning message mm/mglru: fix overshooting shrinker memory mm/mglru: fix div-by-zero in vmpressure_calc_level() mm/kmemleak: replace strncpy() with strscpy() mm, page_alloc: put should_fail_alloc_page() back behing CONFIG_FAIL_PAGE_ALLOC mm, slab: put should_failslab() back behind CONFIG_SHOULD_FAILSLAB mm: ignore data-race in __swap_writepage hugetlbfs: ensure generic_hugetlb_get_unmapped_area() returns higher address than mmap_min_addr mm: shmem: rename mTHP shmem counters mm: swap_state: use folio_alloc_mpol() in __read_swap_cache_async() mm/migrate: putback split folios when numa hint migration fails ...
2024-07-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "ARM: - Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement - Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol - FPSIMD/SVE support for nested, including merged trap configuration and exception routing - New command-line parameter to control the WFx trap behavior under KVM - Introduce kCFI hardening in the EL2 hypervisor - Fixes + cleanups for handling presence/absence of FEAT_TCRX - Miscellaneous fixes + documentation updates LoongArch: - Add paravirt steal time support - Add support for KVM_DIRTY_LOG_INITIALLY_SET - Add perf kvm-stat support for loongarch RISC-V: - Redirect AMO load/store access fault traps to guest - perf kvm stat support - Use guest files for IMSIC virtualization, when available s390: - Assortment of tiny fixes which are not time critical x86: - Fixes for Xen emulation - Add a global struct to consolidate tracking of host values, e.g. EFER - Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC bus frequency, because TDX - Print the name of the APICv/AVIC inhibits in the relevant tracepoint - Clean up KVM's handling of vendor specific emulation to consistently act on "compatible with Intel/AMD", versus checking for a specific vendor - Drop MTRR virtualization, and instead always honor guest PAT on CPUs that support self-snoop - Update to the newfangled Intel CPU FMS infrastructure - Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads '0' and writes from userspace are ignored - Misc cleanups x86 - MMU: - Small cleanups, renames and refactoring extracted from the upcoming Intel TDX support - Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't hold leafs SPTEs - Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager page splitting, to avoid stalling vCPUs when splitting huge pages - Bug the VM instead of simply warning if KVM tries to split a SPTE that is non-present or not-huge. KVM is guaranteed to end up in a broken state because the callers fully expect a valid SPTE, it's all but dangerous to let more MMU changes happen afterwards x86 - AMD: - Make per-CPU save_area allocations NUMA-aware - Force sev_es_host_save_area() to be inlined to avoid calling into an instrumentable function from noinstr code - Base support for running SEV-SNP guests. API-wise, this includes a new KVM_X86_SNP_VM type, encrypting/measure the initial image into guest memory, and finalizing it before launching it. Internally, there are some gmem/mmu hooks needed to prepare gmem-allocated pages before mapping them into guest private memory ranges This includes basic support for attestation guest requests, enough to say that KVM supports the GHCB 2.0 specification There is no support yet for loading into the firmware those signing keys to be used for attestation requests, and therefore no need yet for the host to provide certificate data for those keys. To support fetching certificate data from userspace, a new KVM exit type will be needed to handle fetching the certificate from userspace. An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS exit type to handle this was introduced in v1 of this patchset, but is still being discussed by community, so for now this patchset only implements a stub version of SNP Extended Guest Requests that does not provide certificate data x86 - Intel: - Remove an unnecessary EPT TLB flush when enabling hardware - Fix a series of bugs that cause KVM to fail to detect nested pending posted interrupts as valid wake eents for a vCPU executing HLT in L2 (with HLT-exiting disable by L1) - KVM: x86: Suppress MMIO that is triggered during task switch emulation Explicitly suppress userspace emulated MMIO exits that are triggered when emulating a task switch as KVM doesn't support userspace MMIO during complex (multi-step) emulation Silently ignoring the exit request can result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace for some other reason prior to purging mmio_needed See commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if emulator detects exception") for more details on KVM's limitations with respect to emulated MMIO during complex emulator flows Generic: - Rename the AS_UNMOVABLE flag that was introduced for KVM to AS_INACCESSIBLE, because the special casing needed by these pages is not due to just unmovability (and in fact they are only unmovable because the CPU cannot access them) - New ioctl to populate the KVM page tables in advance, which is useful to mitigate KVM page faults during guest boot or after live migration. The code will also be used by TDX, but (probably) not through the ioctl - Enable halt poll shrinking by default, as Intel found it to be a clear win - Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86 - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out() - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs - Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout Selftests: - Remove dead code in the memslot modification stress test - Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs - Print the guest pseudo-RNG seed only when it changes, to avoid spamming the log for tests that create lots of VMs - Make the PMU counters test less flaky when counting LLC cache misses by doing CLFLUSH{OPT} in every loop iteration" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) crypto: ccp: Add the SNP_VLEK_LOAD command KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops KVM: x86: Replace static_call_cond() with static_call() KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event x86/sev: Move sev_guest.h into common SEV header KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event KVM: x86: Suppress MMIO that is triggered during task switch emulation KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory() KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault" KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory KVM: Document KVM_PRE_FAULT_MEMORY ioctl mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE perf kvm: Add kvm-stat for loongarch64 LoongArch: KVM: Add PV steal time support in guest side ...
2024-07-19mm: add MAP_DROPPABLE for designating always lazily freeable mappingsJason A. Donenfeld
The vDSO getrandom() implementation works with a buffer allocated with a new system call that has certain requirements: - It shouldn't be written to core dumps. * Easy: VM_DONTDUMP. - It should be zeroed on fork. * Easy: VM_WIPEONFORK. - It shouldn't be written to swap. * Uh-oh: mlock is rlimited. * Uh-oh: mlock isn't inherited by forks. - It shouldn't reserve actual memory, but it also shouldn't crash when page faulting in memory if none is available * Uh-oh: VM_NORESERVE means segfaults. It turns out that the vDSO getrandom() function has three really nice characteristics that we can exploit to solve this problem: 1) Due to being wiped during fork(), the vDSO code is already robust to having the contents of the pages it reads zeroed out midway through the function's execution. 2) In the absolute worst case of whatever contingency we're coding for, we have the option to fallback to the getrandom() syscall, and everything is fine. 3) The buffers the function uses are only ever useful for a maximum of 60 seconds -- a sort of cache, rather than a long term allocation. These characteristics mean that we can introduce VM_DROPPABLE, which has the following semantics: a) It never is written out to swap. b) Under memory pressure, mm can just drop the pages (so that they're zero when read back again). c) It is inherited by fork. d) It doesn't count against the mlock budget, since nothing is locked. e) If there's not enough memory to service a page fault, it's not fatal, and no signal is sent. This way, allocations used by vDSO getrandom() can use: VM_DROPPABLE | VM_DONTDUMP | VM_WIPEONFORK | VM_NORESERVE And there will be no problem with OOMing, crashing on overcommitment, using memory when not in use, not wiping on fork(), coredumps, or writing out to swap. In order to let vDSO getrandom() use this, expose these via mmap(2) as MAP_DROPPABLE. Note that this involves removing the MADV_FREE special case from sort_folio(), which according to Yu Zhao is unnecessary and will simply result in an extra call to shrink_folio_list() in the worst case. The chunk removed reenables the swapbacked flag, which we don't want for VM_DROPPABLE, and we can't conditionalize it here because there isn't a vma reference available. Finally, the provided self test ensures that this is working as desired. Cc: linux-mm@kvack.org Acked-by: David Hildenbrand <david@redhat.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2024-07-12tools: sync uapi/linux/fs.h header into tools subdirAndrii Nakryiko
We need this UAPI header in tools/include subdirectory for using it from BPF selftests. Link: https://lkml.kernel.org/r/20240627170900.1672542-6-andrii@kernel.org Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mike Rapoport (IBM) <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-12Merge tag 'loongarch-kvm-6.11' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.11 1. Add ParaVirt steal time support. 2. Add some VM migration enhancement. 3. Add perf kvm-stat support for loongarch.
2024-07-12KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORYIsaku Yamahata
Add a test case to exercise KVM_PRE_FAULT_MEMORY and run the guest to access the pre-populated area. It tests KVM_PRE_FAULT_MEMORY ioctl for KVM_X86_DEFAULT_VM and KVM_X86_SW_PROTECTED_VM. Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Message-ID: <32427791ef42e5efaafb05d2ac37fa4372715f47.1712785629.git.isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-07-09Merge tag 'for-netdev' of ↵Paolo Abeni
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-07-08 The following pull-request contains BPF updates for your *net-next* tree. We've added 102 non-merge commits during the last 28 day(s) which contain a total of 127 files changed, 4606 insertions(+), 980 deletions(-). The main changes are: 1) Support resilient split BTF which cuts down on duplication and makes BTF as compact as possible wrt BTF from modules, from Alan Maguire & Eduard Zingerman. 2) Add support for dumping kfunc prototypes from BTF which enables both detecting as well as dumping compilable prototypes for kfuncs, from Daniel Xu. 3) Batch of s390x BPF JIT improvements to add support for BPF arena and to implement support for BPF exceptions, from Ilya Leoshkevich. 4) Batch of riscv64 BPF JIT improvements in particular to add 12-argument support for BPF trampolines and to utilize bpf_prog_pack for the latter, from Pu Lehui. 5) Extend BPF test infrastructure to add a CHECKSUM_COMPLETE validation option for skbs and add coverage along with it, from Vadim Fedorenko. 6) Inline bpf_get_current_task/_btf() helpers in the arm64 BPF JIT which gives a small 1% performance improvement in micro-benchmarks, from Puranjay Mohan. 7) Extend the BPF verifier to track the delta between linked registers in order to better deal with recent LLVM code optimizations, from Alexei Starovoitov. 8) Fix bpf_wq_set_callback_impl() kfunc signature where the third argument should have been a pointer to the map value, from Benjamin Tissoires. 9) Extend BPF selftests to add regular expression support for test output matching and adjust some of the selftest when compiled under gcc, from Cupertino Miranda. 10) Simplify task_file_seq_get_next() and remove an unnecessary loop which always iterates exactly once anyway, from Dan Carpenter. 11) Add the capability to offload the netfilter flowtable in XDP layer through kfuncs, from Florian Westphal & Lorenzo Bianconi. 12) Various cleanups in networking helpers in BPF selftests to shave off a few lines of open-coded functions on client/server handling, from Geliang Tang. 13) Properly propagate prog->aux->tail_call_reachable out of BPF verifier, so that x86 JIT does not need to implement detection, from Leon Hwang. 14) Fix BPF verifier to add a missing check_func_arg_reg_off() to prevent an out-of-bounds memory access for dynpointers, from Matt Bobrowski. 15) Fix bpf_session_cookie() kfunc to return __u64 instead of long pointer as it might lead to problems on 32-bit archs, from Jiri Olsa. 16) Enhance traffic validation and dynamic batch size support in xsk selftests, from Tushar Vyavahare. bpf-next-for-netdev * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (102 commits) selftests/bpf: DENYLIST.aarch64: Remove fexit_sleep selftests/bpf: amend for wrong bpf_wq_set_callback_impl signature bpf: helpers: fix bpf_wq_set_callback_impl signature libbpf: Add NULL checks to bpf_object__{prev_map,next_map} selftests/bpf: Remove exceptions tests from DENYLIST.s390x s390/bpf: Implement exceptions s390/bpf: Change seen_reg to a mask bpf: Remove unnecessary loop in task_file_seq_get_next() riscv, bpf: Optimize stack usage of trampoline bpf, devmap: Add .map_alloc_check selftests/bpf: Remove arena tests from DENYLIST.s390x selftests/bpf: Add UAF tests for arena atomics selftests/bpf: Introduce __arena_global s390/bpf: Support arena atomics s390/bpf: Enable arena s390/bpf: Support address space cast instruction s390/bpf: Support BPF_PROBE_MEM32 s390/bpf: Land on the next JITed instruction after exception s390/bpf: Introduce pre- and post- probe functions s390/bpf: Get rid of get_probe_mem_regno() ... ==================== Link: https://patch.msgid.link/20240708221438.10974-1-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>