linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-10-12	KVM: selftests: Force load all supported XSAVE state in state test	Sean Christopherson
	Extend x86's state to forcefully load all host-supported xfeatures by modifying xstate_bv in the saved state. Stuffing xstate_bv ensures that the selftest is verifying KVM's full ABI regardless of whether or not the guest code is successful in getting various xfeatures out of their INIT state, e.g. see the disaster that is/was MPX. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20230928001956.924301-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-12	KVM: selftests: Load XSAVE state into untouched vCPU during state test	Sean Christopherson
	Expand x86's state test to load XSAVE state into a "dummy" vCPU prior to KVM_SET_CPUID2, and again with an empty guest CPUID model. Except for off-by-default features, i.e. AMX, KVM's ABI for KVM_SET_XSAVE is that userspace is allowed to load xfeatures so long as they are supported by the host. This is a regression test for a combination of KVM bugs where the state saved by KVM_GET_XSAVE{2} could not be loaded via KVM_SET_XSAVE if the saved xstate_bv would load guest-unsupported xfeatures. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20230928001956.924301-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-12	KVM: selftests: Touch relevant XSAVE state in guest for state test	Sean Christopherson
	Modify support XSAVE state in the "state test's" guest code so that saving and loading state via KVM_{G,S}ET_XSAVE actually does something useful, i.e. so that xstate_bv in XSAVE state isn't empty. Punt on BNDCSR for now, it's easier to just stuff that xfeature from the host side. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20230928001956.924301-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-12	KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}	Sean Christopherson
	Mask off xfeatures that aren't exposed to the guest only when saving guest state via KVM_GET_XSAVE{2} instead of modifying user_xfeatures directly. Preserving the maximal set of xfeatures in user_xfeatures restores KVM's ABI for KVM_SET_XSAVE, which prior to commit ad856280ddea ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0") allowed userspace to load xfeatures that are supported by the host, irrespective of what xfeatures are exposed to the guest. There is no known use case where userspace intentionally loads xfeatures that aren't exposed to the guest, but the bug fixed by commit ad856280ddea was specifically that KVM_GET_SAVE{2} would save xfeatures that weren't exposed to the guest, e.g. would lead to userspace unintentionally loading guest-unsupported xfeatures when live migrating a VM. Restricting KVM_SET_XSAVE to guest-supported xfeatures is especially problematic for QEMU-based setups, as QEMU has a bug where instead of terminating the VM if KVM_SET_XSAVE fails, QEMU instead simply stops loading guest state, i.e. resumes the guest after live migration with incomplete guest state, and ultimately results in guest data corruption. Note, letting userspace restore all host-supported xfeatures does not fix setups where a VM is migrated from a host without commit ad856280ddea, to a target with a subset of host-supported xfeatures. However there is no way to safely address that scenario, e.g. KVM could silently drop the unsupported features, but that would be a clear violation of KVM's ABI and so would require userspace to opt-in, at which point userspace could simply be updated to sanitize the to-be-loaded XSAVE state. Reported-by: Tyler Stachecki <stachecki.tyler@gmail.com> Closes: https://lore.kernel.org/all/20230914010003.358162-1-tstachecki@bloomberg.net Fixes: ad856280ddea ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0") Cc: stable@vger.kernel.org Cc: Leonardo Bras <leobras@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Message-Id: <20230928001956.924301-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-12	x86/fpu: Allow caller to constrain xfeatures when copying to uabi buffer	Sean Christopherson
	Plumb an xfeatures mask into __copy_xstate_to_uabi_buf() so that KVM can constrain which xfeatures are saved into the userspace buffer without having to modify the user_xfeatures field in KVM's guest_fpu state. KVM's ABI for KVM_GET_XSAVE{2} is that features that are not exposed to guest must not show up in the effective xstate_bv field of the buffer. Saving only the guest-supported xfeatures allows userspace to load the saved state on a different host with a fewer xfeatures, so long as the target host supports the xfeatures that are exposed to the guest. KVM currently sets user_xfeatures directly to restrict KVM_GET_XSAVE{2} to the set of guest-supported xfeatures, but doing so broke KVM's historical ABI for KVM_SET_XSAVE, which allows userspace to load any xfeatures that are supported by the host. Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20230928001956.924301-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-12	Merge tag 'kvm-s390-master-6.6-1' of ↵	Paolo Bonzini
	git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD One small fix for gisa to avoid stalls.
2023-10-12	smb: client: prevent new fids from being removed by laundromat	Paulo Alcantara
	Check if @cfid->time is set in laundromat so we guarantee that only fully cached fids will be selected for removal. While we're at it, add missing locks to protect access of @cfid fields in order to avoid races with open_cached_dir() and cfids_laundromat_worker(), respectively. Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-10-12	smb: client: make laundromat a delayed worker	Paulo Alcantara
	By having laundromat kthread processing cached directories on every second turned out to be overkill, especially when having multiple SMB mounts. Relax it by using a delayed worker instead that gets scheduled on every @dir_cache_timeout (default=30) seconds per tcon. This also fixes the 1s delay when tearing down tcon. Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-10-12	net: gso_test: fix build with gcc-12 and earlier	Florian Westphal
	gcc 12 errors out with: net/core/gso_test.c:58:48: error: initializer element is not constant 58 \| .segs = (const unsigned int[]) { gso_size }, This version isn't old (2022), so switch to preprocessor-bsaed constant instead of 'static const int'. Cc: Willem de Bruijn <willemb@google.com> Reported-by: Tasmiya Nalatwad <tasmiya@linux.vnet.ibm.com> Closes: https://lore.kernel.org/netdev/79fbe35c-4dd1-4f27-acb2-7a60794bc348@linux.vnet.ibm.com/ Fixes: 1b4fa28a8b07 ("net: parametrize skb_segment unit test to expand coverage") Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231012120901.10765-1-fw@strlen.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-12	nfp: add support CHACHA20-POLY1305 offload for ipsec	Shihong Wang
	Add the configuration of CHACHA20-POLY1305 to the driver and send the message to hardware so that the NIC supports the algorithm. Signed-off-by: Shihong Wang <shihong.wang@corigine.com> Signed-off-by: Louis Peens <louis.peens@corigine.com> Link: https://lore.kernel.org/r/20231009080946.7655-2-louis.peens@corigine.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-12	riscv: signal: fix sigaltstack frame size checking	Andy Chiu
	The alternative stack checking in get_sigframe introduced by the Vector support is not needed and has a problem. It is not needed as we have already validate it at the beginning of the function if we are already on an altstack. If not, the size of an altstack is always validated at its allocation stage with sigaltstack_size_valid(). Besides, we must only regard the size of an altstack if the handler of a signal is registered with SA_ONSTACK. So, blindly checking overflow of an altstack if sas_ss_size not equals to zero will check against wrong signal handlers if only a subset of signals are registered with SA_ONSTACK. Fixes: 8ee0b41898fa ("riscv: signal: Add sigcontext save/restore for vector") Reported-by: Prashanth Swaminathan <prashanthsw@google.com> Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Link: https://lore.kernel.org/r/20230822164904.21660-1-andy.chiu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2023-10-12	wifi: rtw89: coex: add annotation __counted_by() to struct ↵	Ping-Ke Shih
	rtw89_btc_btf_set_mon_reg Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). Use struct_size() and flex_array_size() helpers to calculate proper sizes for allocation and memcpy(). Don't change logic at all, and result is identical as before. Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231011063725.25276-2-pkshih@realtek.com
2023-10-12	wifi: rtw89: coex: add annotation __counted_by() for struct ↵	Ping-Ke Shih
	rtw89_btc_btf_set_slot_table Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). Use struct_size() and flex_array_size() helpers to calculate proper sizes for allocation and memcpy(). Don't change logic at all, and result is identical as before. Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231011063725.25276-1-pkshih@realtek.com
2023-10-12	wifi: rtw89: add EHT radiotap in monitor mode	Ping-Ke Shih
	Add IEEE80211_RADIOTAP_EHT and IEEE80211_RADIOTAP_EHT_USIG radiotap to fill basic EHT NSS, MCS, GI and bandwidth. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231011115256.6121-7-pkshih@realtek.com
2023-10-12	wifi: rtw89: show EHT rate in debugfs	Ping-Ke Shih
	Since we have TX rate from RA report of C2H event and RX rate from RX descriptor, show them in debugfs like TX rate [1]: EHT 2SS MCS-7 GI:3.2 BW:80 (hw_rate=0x427) RX rate [1]: EHT 2SS MCS-7 GI:3.2 BW:80 (hw_rate=0x427) Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231011115256.6121-6-pkshih@realtek.com
2023-10-12	wifi: rtw89: parse TX EHT rate selected by firmware from RA C2H report	Ping-Ke Shih
	RA (rate adaptive) C2H report is to reflect current TX rate firmware is using. Parse C2H event encoded in EHT mode, and then user space and debugfs can use the information to know TX rate. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231011115256.6121-5-pkshih@realtek.com
2023-10-12	wifi: rtw89: Add EHT rate mask as parameters of RA H2C command	Ping-Ke Shih
	Set EHT rate mask to RA (rate adaptive) H2C command according to handshake result. The EHT rate mask format looks like 44 28 12 4 0 +----------------+----------------+--------+----+ \| EHT 2SS rate \| EHT 1SS rate \| OFDM \| CCK\| +----------------+----------------+--------+----+ Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231011115256.6121-4-pkshih@realtek.com
2023-10-12	wifi: rtw89: parse EHT information from RX descriptor and PPDU status packet	Ping-Ke Shih
	There are two kinds of RX packets -- normal and its PPDU status packet. Both have RX descriptor containing some information such as rate, GI and bandwidth, and we use these information to find the relationship between two kinds of packets. Then, we can get more information like RSSI and EVM from PPDU status packet. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231011115256.6121-3-pkshih@realtek.com
2023-10-12	wifi: radiotap: add bandwidth definition of EHT U-SIG	Ping-Ke Shih
	Define EHT U-SIG bandwidth used by radiotap according to Table 36-28 "U-SIG field of an EHT MU PPDU" in 802.11be (D3.0). Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231011115256.6121-2-pkshih@realtek.com
2023-10-12	wifi: rtlwifi: use convenient list_count_nodes()	Dmitry Antipov
	Simplify 'rtl92ee_dm_common_info_self_update()', 'rtl8723be_dm_common_info_self_update()', and 'rtl8821ae_dm_common_info_self_update()' by using 'list_count_nodes()'. Compile tested only. Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Acked-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231011045227.7989-1-dmantipov@yandex.ru
2023-10-12	wifi: p54: Annotate struct p54_cal_database with __counted_by	Kees Cook
	Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). Add __counted_by for struct p54_cal_database. Cc: Christian Lamparter <chunkeey@googlemail.com> Cc: Kalle Valo <kvalo@kernel.org> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: linux-wireless@vger.kernel.org Cc: linux-hardening@vger.kernel.org Suggested-by: Jason Andryuk <jandryuk@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Jason Andryuk <jandryuk@gmail.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20231009161028.it.544-kees@kernel.org
2023-10-12	IXP4xx MAINTAINERS entries	Krzysztof Hałasa
	Update MAINTAINERS entries for Intel IXP4xx SoCs. Linus has been handling all IXP4xx stuff since 2019 or so. Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Deepak Saxena <dsaxena@plexity.net> Link: https://lore.kernel.org/r/m3ttqxu4ru.fsf@t19.piap.pl Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-10-12	Merge branch 'rswitch-fix-issues-on-specific-conditions'	Paolo Abeni
	Yoshihiro Shimoda says: ==================== rswitch: Fix issues on specific conditions This patch series fix some issues of rswitch driver on specific condtions. ==================== Link: https://lore.kernel.org/r/20231010124858.183891-1-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-12	rswitch: Fix imbalance phy_power_off() calling	Yoshihiro Shimoda
	The phy_power_off() should not be called if phy_power_on() failed. So, add a condition .power_count before calls phy_power_off(). Fixes: 5cb630925b49 ("net: renesas: rswitch: Add phy_power_{on,off}() calling") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-12	rswitch: Fix renesas_eth_sw_remove() implementation	Yoshihiro Shimoda
	Fix functions calling order and a condition in renesas_eth_sw_remove(). Otherwise, kernel NULL pointer dereference happens from phy_stop() if a net device opens. Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-12	drm/tiny: correctly print `struct resource *` on error	Joey Gouly
	The `res` variable is already a `struct resource `, don't take the address of it. Fixes incorrect output: simple-framebuffer 9e20dc000.framebuffer: [drm] ERROR* could not acquire memory range [??? 0xffff4be88a387d00-0xfffffefffde0a240 flags 0x0]: -16 To be correct: simple-framebuffer 9e20dc000.framebuffer: [drm] ERROR could not acquire memory range [mem 0x9e20dc000-0x9e307bfff flags 0x200]: -16 Signed-off-by: Joey Gouly <joey.gouly@arm.com> Fixes: 9a10c7e6519b ("drm/simpledrm: Add support for system memory framebuffers") Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Thierry Reding <treding@nvidia.com> Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.3+ Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20231010174652.2439513-1-joey.gouly@arm.com
2023-10-12	drm: Do not overrun array in drm_gem_get_pages()	Matthew Wilcox (Oracle)
	If the shared memory object is larger than the DRM object that it backs, we can overrun the page array. Limit the number of pages we install from each folio to prevent this. Signed-off-by: "Matthew Wilcox (Oracle)" <willy@infradead.org> Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Link: https://lore.kernel.org/lkml/13360591.uLZWGnKmhe@natalenko.name/ Fixes: 3291e09a4638 ("drm: convert drm_gem_put_pages() to use a folio_batch") Cc: stable@vger.kernel.org # 6.5.x Signed-off-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231005135648.2317298-1-willy@infradead.org
2023-10-12	netfilter: nft_payload: fix wrong mac header matching	Florian Westphal
	mcast packets get looped back to the local machine. Such packets have a 0-length mac header, we should treat this like "mac header not set" and abort rule evaluation. As-is, we just copy data from the network header instead. Fixes: 96518518cc41 ("netfilter: add nftables") Reported-by: Blažej Krajňák <krajnak@levonet.sk> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-12	nf_tables: fix NULL pointer dereference in nft_expr_inner_parse()	Xingyuan Mo
	We should check whether the NFTA_EXPR_NAME netlink attribute is present before accessing it, otherwise a null pointer deference error will occur. Call Trace: <TASK> dump_stack_lvl+0x4f/0x90 print_report+0x3f0/0x620 kasan_report+0xcd/0x110 __asan_load2+0x7d/0xa0 nla_strcmp+0x2f/0x90 __nft_expr_type_get+0x41/0xb0 nft_expr_inner_parse+0xe3/0x200 nft_inner_init+0x1be/0x2e0 nf_tables_newrule+0x813/0x1230 nfnetlink_rcv_batch+0xec3/0x1170 nfnetlink_rcv+0x1e4/0x220 netlink_unicast+0x34e/0x4b0 netlink_sendmsg+0x45c/0x7e0 __sys_sendto+0x355/0x370 __x64_sys_sendto+0x84/0xa0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching") Signed-off-by: Xingyuan Mo <hdthky0@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-12	nf_tables: fix NULL pointer dereference in nft_inner_init()	Xingyuan Mo
	We should check whether the NFTA_INNER_NUM netlink attribute is present before accessing it, otherwise a null pointer deference error will occur. Call Trace: dump_stack_lvl+0x4f/0x90 print_report+0x3f0/0x620 kasan_report+0xcd/0x110 __asan_load4+0x84/0xa0 nft_inner_init+0x128/0x2e0 nf_tables_newrule+0x813/0x1230 nfnetlink_rcv_batch+0xec3/0x1170 nfnetlink_rcv+0x1e4/0x220 netlink_unicast+0x34e/0x4b0 netlink_sendmsg+0x45c/0x7e0 __sys_sendto+0x355/0x370 __x64_sys_sendto+0x84/0xa0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching") Signed-off-by: Xingyuan Mo <hdthky0@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-12	netfilter: nf_tables: do not refresh timeout when resetting element	Pablo Neira Ayuso
	The dump and reset command should not refresh the timeout, this command is intended to allow users to list existing stateful objects and reset them, element expiration should be refresh via transaction instead with a specific command to achieve this, otherwise this is entering combo semantics that will be hard to be undone later (eg. a user asking to retrieve counters but _not_ requiring to refresh expiration). Fixes: 079cd633219d ("netfilter: nf_tables: Introduce NFT_MSG_GETSETELEM_RESET") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-12	netfilter: nf_tables: Annotate struct nft_pipapo_match with __counted_by	Kees Cook
	Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct nft_pipapo_match. Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Jozsef Kadlecsik <kadlec@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: netfilter-devel@vger.kernel.org Cc: coreteam@netfilter.org Cc: netdev@vger.kernel.org Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1] Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-12	netfilter: nfnetlink_log: silence bogus compiler warning	Florian Westphal
	net/netfilter/nfnetlink_log.c:800:18: warning: variable 'ctinfo' is uninitialized The warning is bogus, the variable is only used if ct is non-NULL and always initialised in that case. Init to 0 too to silence this. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202309100514.ndBFebXN-lkp@intel.com/ Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-12	netfilter: nf_tables: do not remove elements if set backend implements .abort	Pablo Neira Ayuso
	pipapo set backend maintains two copies of the datastructure, removing the elements from the copy that is going to be discarded slows down the abort path significantly, from several minutes to few seconds after this patch. Fixes: 212ed75dc5fb ("netfilter: nf_tables: integrate pipapo into commit protocol") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-12	octeontx2-pf: Fix page pool frag allocation warning	Ratheesh Kannoth
	Since page pool param's "order" is set to 0, will result in below warn message if interface is configured with higher rx buffer size. Steps to reproduce the issue. 1. devlink dev param set pci/0002:04:00.0 name receive_buffer_size \ value 8196 cmode runtime 2. ifconfig eth0 up [ 19.901356] ------------[ cut here ]------------ [ 19.901361] WARNING: CPU: 11 PID: 12331 at net/core/page_pool.c:567 page_pool_alloc_frag+0x3c/0x230 [ 19.901449] pstate: 82401009 (Nzcv daif +PAN -UAO +TCO -DIT +SSBS BTYPE=--) [ 19.901451] pc : page_pool_alloc_frag+0x3c/0x230 [ 19.901453] lr : __otx2_alloc_rbuf+0x60/0xbc [rvu_nicpf] [ 19.901460] sp : ffff80000f66b970 [ 19.901461] x29: ffff80000f66b970 x28: 0000000000000000 x27: 0000000000000000 [ 19.901464] x26: ffff800000d15b68 x25: ffff000195b5c080 x24: ffff0002a5a32dc0 [ 19.901467] x23: ffff0001063c0878 x22: 0000000000000100 x21: 0000000000000000 [ 19.901469] x20: 0000000000000000 x19: ffff00016f781000 x18: 0000000000000000 [ 19.901472] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 [ 19.901474] x14: 0000000000000000 x13: ffff0005ffdc9c80 x12: 0000000000000000 [ 19.901477] x11: ffff800009119a38 x10: 4c6ef2e3ba300519 x9 : ffff800000d13844 [ 19.901479] x8 : ffff0002a5a33cc8 x7 : 0000000000000030 x6 : 0000000000000030 [ 19.901482] x5 : 0000000000000005 x4 : 0000000000000000 x3 : 0000000000000a20 [ 19.901484] x2 : 0000000000001080 x1 : ffff80000f66b9d4 x0 : 0000000000001000 [ 19.901487] Call trace: [ 19.901488] page_pool_alloc_frag+0x3c/0x230 [ 19.901490] __otx2_alloc_rbuf+0x60/0xbc [rvu_nicpf] [ 19.901494] otx2_rq_aura_pool_init+0x1c4/0x240 [rvu_nicpf] [ 19.901498] otx2_open+0x228/0xa70 [rvu_nicpf] [ 19.901501] otx2vf_open+0x20/0xd0 [rvu_nicvf] [ 19.901504] __dev_open+0x114/0x1d0 [ 19.901507] __dev_change_flags+0x194/0x210 [ 19.901510] dev_change_flags+0x2c/0x70 [ 19.901512] devinet_ioctl+0x3a4/0x6c4 [ 19.901515] inet_ioctl+0x228/0x240 [ 19.901518] sock_ioctl+0x2ac/0x480 [ 19.901522] __arm64_sys_ioctl+0x564/0xe50 [ 19.901525] invoke_syscall.constprop.0+0x58/0xf0 [ 19.901529] do_el0_svc+0x58/0x150 [ 19.901531] el0_svc+0x30/0x140 [ 19.901533] el0t_64_sync_handler+0xe8/0x114 [ 19.901535] el0t_64_sync+0x1a0/0x1a4 [ 19.901537] ---[ end trace 678c0bf660ad8116 ]--- Fixes: b2e3406a38f0 ("octeontx2-pf: Add support for page pool") Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Link: https://lore.kernel.org/r/20231010034842.3807816-1-rkannoth@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-12	nfc: nci: assert requested protocol is valid	Jeremy Cline
	The protocol is used in a bit mask to determine if the protocol is supported. Assert the provided protocol is less than the maximum defined so it doesn't potentially perform a shift-out-of-bounds and provide a clearer error for undefined protocols vs unsupported ones. Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation") Reported-and-tested-by: syzbot+0839b78e119aae1fec78@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=0839b78e119aae1fec78 Signed-off-by: Jeremy Cline <jeremy@jcline.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20231009200054.82557-1-jeremy@jcline.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-12	af_packet: Fix fortified memcpy() without flex array.	Kuniyuki Iwashima
	Sergei Trofimovich reported a regression [0] caused by commit a0ade8404c3b ("af_packet: Fix warning of fortified memcpy() in packet_getname()."). It introduced a flex array sll_addr_flex in struct sockaddr_ll as a union-ed member with sll_addr to work around the fortified memcpy() check. However, a userspace program uses a struct that has struct sockaddr_ll in the middle, where a flex array is illegal to exist. include/linux/if_packet.h:24:17: error: flexible array member 'sockaddr_ll::<unnamed union>::<unnamed struct>::sll_addr_flex' not at end of 'struct packet_info_t' 24 \| __DECLARE_FLEX_ARRAY(unsigned char, sll_addr_flex); \| ^~~~~~~~~~~~~~~~~~~~ To fix the regression, let's go back to the first attempt [1] telling memcpy() the actual size of the array. Reported-by: Sergei Trofimovich <slyich@gmail.com> Closes: https://github.com/NixOS/nixpkgs/pull/252587#issuecomment-1741733002 [0] Link: https://lore.kernel.org/netdev/20230720004410.87588-3-kuniyu@amazon.com/ [1] Fixes: a0ade8404c3b ("af_packet: Fix warning of fortified memcpy() in packet_getname().") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20231009153151.75688-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-12	pinctrl: renesas: rzn1: Enable missing PINMUX	Ralph Siemsen
	Enable pin muxing (eg. programmable function), so that the RZ/N1 GPIO pins will be configured as specified by the pinmux in the DTS. This used to be enabled implicitly via CONFIG_GENERIC_PINMUX_FUNCTIONS, however that was removed, since the RZ/N1 driver does not call any of the generic pinmux functions. Fixes: 1308fb4e4eae14e6 ("pinctrl: rzn1: Do not select GENERIC_PIN{CTRL_GROUPS,MUX_FUNCTIONS}") Signed-off-by: Ralph Siemsen <ralph.siemsen@linaro.org> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20231004200008.1306798-1-ralph.siemsen@linaro.org Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2023-10-12	xfs: reinstate the old i_version counter as STATX_CHANGE_COOKIE	Jeff Layton
	The handling of STATX_CHANGE_COOKIE was moved into generic_fillattr in commit 0d72b92883c6 (fs: pass the request_mask to generic_fillattr), but we didn't account for the fact that xfs doesn't call generic_fillattr at all. Make XFS report its i_version as the STATX_CHANGE_COOKIE. Fixes: 0d72b92883c6 (fs: pass the request_mask to generic_fillattr) Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-10-12	xfs: Remove duplicate include	Jiapeng Chong
	./fs/xfs/scrub/xfile.c: xfs_format.h is included more than once. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6209 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-10-12	xfs: correct calculation for agend and blockcount	Shiyang Ruan
	The agend should be "start + length - 1", then, blockcount should be "end + 1 - start". Correct 2 calculation mistakes. Also, rename "agend" to "range_agend" because it's not the end of the AG per se; it's the end of the dead region within an AG's agblock space. Fixes: 5cf32f63b0f4 ("xfs: fix the calculation for "end" and "length"") Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-10-12	Merge tag 'random-fixes-6.6_2023-10-11' of ↵	Chandan Babu R
	https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-6.6-fixesD xfs: random fixes for 6.6 Rollup of a couple of reviewed fixes. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org> * tag 'random-fixes-6.6_2023-10-11' of https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux: xfs: process free extents to busy list in FIFO order xfs: adjust the incore perag block_count when shrinking
2023-10-11	Merge tag 'nf-next-23-10-10' of ↵	Jakub Kicinski
	https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Florian Westphal says: ==================== netfilter updates for next First 5 patches, from Phil Sutter, clean up nftables dumpers to use the context buffer in the netlink_callback structure rather than a kmalloc'd buffer. Patch 6, from myself, zaps dead code and replaces the helper function with a small inlined helper. Patch 7, also from myself, removes another pr_debug and replaces it with the existing nf_log-based debug helpers. Last patch, from George Guo, gets nft_table comments back in sync with the structure members. * tag 'nf-next-23-10-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: cleanup struct nft_table netfilter: conntrack: prefer tcp_error_log to pr_debug netfilter: conntrack: simplify nf_conntrack_alter_reply netfilter: nf_tables: Don't allocate nft_rule_dump_ctx netfilter: nf_tables: Carry s_idx in nft_rule_dump_ctx netfilter: nf_tables: Carry reset flag in nft_rule_dump_ctx netfilter: nf_tables: Drop pointless memset when dumping rules netfilter: nf_tables: Always allocate nft_rule_dump_ctx ==================== Link: https://lore.kernel.org/r/20231010145343.12551-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11	netdev: use napi_schedule bool instead of napi_schedule_prep/__napi_schedule	Christian Marangi
	Replace if condition of napi_schedule_prep/__napi_schedule and use bool from napi_schedule directly where possible. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Link: https://lore.kernel.org/r/20231009133754.9834-5-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11	net: tc35815: rework network interface interrupt logic	Christian Marangi
	Rework network interface logic. Before this change, the code flow was: 1. Disable interrupt 2. Try to schedule a NAPI 3. Check if it was possible (NAPI is not already scheduled) 4. emit BUG() if we receive interrupt while a NAPI is scheduled If some application busy poll or set gro_flush_timeout low enough, it's possible to reach the BUG() condition. Given that the condition may happen and it wouldn't be a bug, rework the logic to permit such case and prevent stall with interrupt never enabled again. Disable the interrupt only if the NAPI can be scheduled (aka it's not already scheduled) and drop the printk and BUG() call. With these change, in the event of a NAPI already scheduled, the interrupt is simply ignored with nothing done. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Link: https://lore.kernel.org/r/20231009133754.9834-4-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11	netdev: replace napi_reschedule with napi_schedule	Christian Marangi
	Now that napi_schedule return a bool, we can drop napi_reschedule that does the same exact function. The function comes from a very old commit bfe13f54f502 ("ibm_emac: Convert to use napi_struct independent of struct net_device") and the purpose is actually deprecated in favour of different logic. Convert every user of napi_reschedule to napi_schedule. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> # ath10k Acked-by: Nick Child <nnac123@linux.ibm.com> # ibm Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for can/dev/rx-offload.c Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20231009133754.9834-3-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11	netdev: make napi_schedule return bool on NAPI successful schedule	Christian Marangi
	Change napi_schedule to return a bool on NAPI successful schedule. This might be useful for some driver to do additional steps after a NAPI has been scheduled. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231009133754.9834-2-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11	netdev: replace simple napi_schedule_prep/__napi_schedule to napi_schedule	Christian Marangi
	Replace drivers that still use napi_schedule_prep/__napi_schedule with napi_schedule helper as it does the same exact check and call. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20231009133754.9834-1-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-11	Merge branch 'Add cgroup sockaddr hooks for unix sockets'	Martin KaFai Lau
	Daan De Meyer says: ==================== Changes since v10: * Removed extra check from bpf_sock_addr_set_sun_path() again in favor of calling unix_validate_addr() everywhere in af_unix.c before calling the hooks. Changes since v9: * Renamed bpf_sock_addr_set_unix_addr() to bpf_sock_addr_set_sun_path() and rennamed arguments to match the new name. * Added an extra check to bpf_sock_addr_set_sun_path() to disallow changing the address of an unnamed unix socket. * Removed unnecessary NULL check on uaddrlen in __cgroup_bpf_run_filter_sock_addr(). Changes since v8: * Added missing test programs to last patch Changes since v7: * Fixed formatting nit in comment * Renamed from cgroup/connectun to cgroup/connect_unix (and similar for all other hooks) Changes since v6: * Actually removed bpf_bind() helper for AF_UNIX hooks. * Fixed merge conflict * Updated comment to mention uaddrlen is read-only for AF_INET[6] * Removed unnecessary forward declaration of struct sock_addr_test * Removed unused BPF_CGROUP_RUN_PROG_UNIX_CONNECT() * Fixed formatting nit reported by checkpatch * Added more information to commit message about recvmsg() on connected socket Changes since v5: * Fixed kernel version in bpftool documentation (6.3 => 6.7). * Added connection mode socket recvmsg() test. * Removed bpf_bind() helper for AF_UNIX hooks. * Added missing getpeernameun and getsocknameun BPF test programs. * Added note for bind() test being unused currently. Changes since v4: * Dropped support for intercepting bind() as when using bind() with unix sockets and a pathname sockaddr, bind() will create an inode in the filesystem that needs to be cleaned up. If the address is rewritten, users might try to clean up the wrong file and leak the actual socket file in the filesystem. * Changed bpf_sock_addr_set_unix_addr() to use BTF_KFUNC_HOOK_CGROUP_SKB instead of BTF_KFUNC_HOOK_COMMON. * Removed unix socket related changes from BPF_CGROUP_PRE_CONNECT_ENABLED() as unix sockets do not support pre-connect. * Added tests for getpeernameun and getsocknameun hooks. * We now disallow an empty sockaddr in bpf_sock_addr_set_unix_addr() similar to unix_validate_addr(). * Removed unnecessary cgroup_bpf_enabled() checks * Removed unnecessary error checks Changes since v3: * Renamed bpf_sock_addr_set_addr() to bpf_sock_addr_set_unix_addr() and made it only operate on AF_UNIX sockaddrs. This is because for the other families, users usually want to configure more than just the address so a generic interface will not fit the bill here. e.g. for AF_INET and AF_INET6, users would generally also want to be able to configure the port which the current interface doesn't support. So we expose an AF_UNIX specific function instead. * Made the tests in the new sock addr tests more generic (similar to test_sock_addr.c), this should make it easier to migrate the other sock addr tests in the future. * Removed the new kfunc hook and attached to BTF_KFUNC_HOOK_COMMON instead * Set uaddrlen to 0 when the family is AF_UNSPEC * Pass in the addrlen to the hook from IPv6 code * Fixed mount directory mkdir() to ignore EEXIST Changes since v2: * Configuring the sock addr is now done via a new kfunc bpf_sock_addr_set() * The addrlen is exposed as u32 in bpf_sock_addr_kern * Selftests are updated to use the new kfunc * Selftests are now added as a new sock_addr test in prog_tests/ * Added BTF_KFUNC_HOOK_SOCK_ADDR for BPF_PROG_TYPE_CGROUP_SOCK_ADDR * __cgroup_bpf_run_filter_sock_addr() now returns the modified addrlen Changes since v1: * Split into multiple patches instead of one single patch * Added unix support for all socket address hooks instead of only connect() * Switched approach to expose the socket address length to the bpf hook instead of recalculating the socket address length in kernelspace to properly support abstract unix socket addresses * Modified socket address hook tests to calculate the socket address length once and pass it around everywhere instead of recalculating the actual unix socket address length on demand. * Added some missing section name tests for getpeername()/getsockname() This patch series extends the cgroup sockaddr hooks to include support for unix sockets. To add support for unix sockets, struct bpf_sock_addr_kern is extended to expose the socket address length to the bpf program. Along with that, a new kfunc bpf_sock_addr_set_unix_addr() is added to safely allow modifying an AF_UNIX sockaddr from bpf programs. I intend to use these new hooks in systemd to reimplement the LogNamespace= feature, which allows running multiple instances of systemd-journald to process the logs of different services. systemd-journald also processes syslog messages, so currently, using log namespaces means all services running in the same log namespace have to live in the same private mount namespace so that systemd can mount the journal namespace's associated syslog socket over /dev/log to properly direct syslog messages from all services running in that log namespace to the correct systemd-journald instance. We want to relax this requirement so that processes running in disjoint mount namespaces can still run in the same log namespace. To achieve this, we can use these new hooks to rewrite the socket address of any connect(), sendto(), ... syscalls to /dev/log to the socket address of the journal namespace's syslog socket instead, which will transparently do the redirection without requiring use of a mount namespace and mounting over /dev/log. Aside from the above usecase, these hooks can more generally be used to transparently redirect unix sockets to different addresses as required by services. ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-10-11	selftests/bpf: Add tests for cgroup unix socket address hooks	Daan De Meyer
	These selftests are written in prog_tests style instead of adding them to the existing test_sock_addr tests. Migrating the existing sock addr tests to prog_tests style is left for future work. This commit adds support for testing bind() sockaddr hooks, even though there's no unix socket sockaddr hook for bind(). We leave this code intact for when the INET and INET6 tests are migrated in the future which do support intercepting bind(). Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com> Link: https://lore.kernel.org/r/20231011185113.140426-10-daan.j.demeyer@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>