linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-07-10	floppy: add missing MODULE_DESCRIPTION() macro	Jeff Johnson
	make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/block/floppy.o Add the missing invocation of the MODULE_DESCRIPTION() macro. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Reviewed-by: Denis Efremov <efremov@linux.com> Link: https://lore.kernel.org/r/20240602-md-block-floppy-v1-1-bc628ea5eb84@quicinc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-10	loop: add missing MODULE_DESCRIPTION() macro	Jeff Johnson
	make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/block/loop.o Add the missing invocation of the MODULE_DESCRIPTION() macro. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240602-md-block-loop-v1-1-b9b7e2603e72@quicinc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-10	ublk_drv: add missing MODULE_DESCRIPTION() macro	Jeff Johnson
	make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/block/ublk_drv.o Add the missing invocation of the MODULE_DESCRIPTION() macro. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20240602-md-block-ublk_drv-v1-1-995474cafff0@quicinc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-10	xen/blkback: add missing MODULE_DESCRIPTION() macro	Jeff Johnson
	make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/block/xen-blkback/xen-blkback.o Add the missing invocation of the MODULE_DESCRIPTION() macro. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240602-md-block-xen-blkback-v1-1-6ff5b58bdee1@quicinc.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-10	io_uring/napi: Remove unnecessary s64 cast	Thorsten Blum
	Since the do_div() macro casts the divisor to u32 anyway, remove the unnecessary s64 cast and fix the following Coccinelle/coccicheck warning reported by do_div.cocci: WARNING: do_div() does a 64-by-32 division, please consider using div64_s64 instead Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Link: https://lore.kernel.org/r/20240710010520.384009-2-thorsten.blum@toblux.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-07-10	swiotlb: reduce swiotlb pool lookups	Michael Kelley
	With CONFIG_SWIOTLB_DYNAMIC enabled, each round-trip map/unmap pair in the swiotlb results in 6 calls to swiotlb_find_pool(). In multiple places, the pool is found and used in one function, and then must be found again in the next function that is called because only the tlb_addr is passed as an argument. These are the six call sites: dma_direct_map_page: 1. swiotlb_map -> swiotlb_tbl_map_single -> swiotlb_bounce dma_direct_unmap_page: 2. dma_direct_sync_single_for_cpu -> is_swiotlb_buffer 3. dma_direct_sync_single_for_cpu -> swiotlb_sync_single_for_cpu -> swiotlb_bounce 4. is_swiotlb_buffer 5. swiotlb_tbl_unmap_single -> swiotlb_del_transient 6. swiotlb_tbl_unmap_single -> swiotlb_release_slots Reduce the number of calls by finding the pool at a higher level, and passing it as an argument instead of searching again. A key change is for is_swiotlb_buffer() to return a pool pointer instead of a boolean, and then pass this pool pointer to subsequent swiotlb functions. There are 9 occurrences of is_swiotlb_buffer() used to test if a buffer is a swiotlb buffer before calling a swiotlb function. To reduce code duplication in getting the pool pointer and passing it as an argument, introduce inline wrappers for this pattern. The generated code is essentially unchanged. Since is_swiotlb_buffer() no longer returns a boolean, rename some functions to reflect the change: * swiotlb_find_pool() becomes __swiotlb_find_pool() * is_swiotlb_buffer() becomes swiotlb_find_pool() * is_xen_swiotlb_buffer() becomes xen_swiotlb_find_pool() With these changes, a round-trip map/unmap pair requires only 2 pool lookups (listed using the new names and wrappers): dma_direct_unmap_page: 1. dma_direct_sync_single_for_cpu -> swiotlb_find_pool 2. swiotlb_tbl_unmap_single -> swiotlb_find_pool These changes come from noticing the inefficiencies in a code review, not from performance measurements. With CONFIG_SWIOTLB_DYNAMIC, __swiotlb_find_pool() is not trivial, and it uses an RCU read lock, so avoiding the redundant calls helps performance in a hot path. When CONFIG_SWIOTLB_DYNAMIC is not set, the code size reduction is minimal and the perf benefits are likely negligible, but no harm is done. No functional change is intended. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Petr Tesarik <petr@tesarici.cz> Signed-off-by: Christoph Hellwig <hch@lst.de>
2024-07-10	PCI: qcom: Prevent use of uninitialized data in qcom_pcie_suspend_noirq()	Dan Carpenter
	Smatch complains that "ret" could be uninitialized if "pcie->icc_mem" is NULL and "pm_suspend_target_state == PM_SUSPEND_MEM". Silence this warning by initializing ret to zero. Fixes: 78b5f6f8855e ("PCI: qcom: Add OPP support to scale performance") Link: https://lore.kernel.org/linux-pci/20240708180539.1447307-4-dan.carpenter@linaro.org Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Anders Roxell <anders.roxell@linaro.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-07-10	PCI: qcom: Prevent potential error pointer dereference	Dan Carpenter
	Only call dev_pm_opp_put() if dev_pm_opp_find_freq_exact() succeeds; otherwise it leads to an error pointer dereference. Fixes: 78b5f6f8855e ("PCI: qcom: Add OPP support to scale performance") Link: https://lore.kernel.org/linux-pci/20240708180539.1447307-3-dan.carpenter@linaro.org Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Anders Roxell <anders.roxell@linaro.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-07-10	PCI: qcom: Fix missing error code in qcom_pcie_probe()	Dan Carpenter
	Return a negative error code if dev_pm_opp_find_freq_floor() fails; don't return success. Fixes: 78b5f6f8855e ("PCI: qcom: Add OPP support to scale performance") Link: https://lore.kernel.org/linux-pci/20240708180539.1447307-2-dan.carpenter@linaro.org Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Anders Roxell <anders.roxell@linaro.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-07-10	minixfs: Fix minixfs_rename with HIGHMEM	Matthew Wilcox (Oracle)
	minixfs now uses kmap_local_page(), so we can't call kunmap() to undo it. This one call was missed as part of the commit this fixes. Fixes: 6628f69ee66a (minixfs: Use dir_put_page() in minix_unlink() and minix_rename()) Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20240709195841.1986374-1-willy@infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-10	PCI: Give pcim_set_mwi() its own devres cleanup callback	Philipp Stanner
	Managing pci_set_mwi() with devres can easily be done with its own callback, without the necessity to store any state about it in a device-related struct. Remove the MWI state from struct pci_devres. Give pcim_set_mwi() a separate devres cleanup callback. Link: https://lore.kernel.org/r/20240613115032.29098-10-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-10	PCI: Move struct pci_devres.pinned bit to struct pci_dev	Philipp Stanner
	The bit describing whether the PCI device is currently pinned is stored in struct pci_devres. To clean up and simplify the PCI devres API, it's better if this information is stored in struct pci_dev. This will later permit simplifying pcim_enable_device(). Move the 'pinned' boolean bit to struct pci_dev. Restructure bits in struct pci_dev so the pm / pme fields are next to each other. Link: https://lore.kernel.org/r/20240613115032.29098-9-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-10	PCI: Remove struct pci_devres.enabled status bit	Philipp Stanner
	The struct pci_devres has a separate boolean to track whether a device is enabled. That, however, can easily be tracked in an agnostic manner through the function pci_is_enabled(). Using it allows for simplifying the PCI devres implementation. Replace the separate 'enabled' status bit from struct pci_devres with calls to pci_is_enabled() at the appropriate places. Link: https://lore.kernel.org/r/20240613115032.29098-8-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-10	PCI: Document hybrid devres hazards	Philipp Stanner
	These functions: pci_request_region() pci_request_regions() pci_request_regions_exclusive() pci_request_selected_regions() pci_request_selected_regions_exclusive() pci_intx() are "hybrid" functions that are managed if pcim_enable_device() has been called, but unmanaged otherwise. This is confusing and has already caused a bug (in 8558de401b5f ("drm/vboxvideo: use managed pci functions")) because users believe all PCI functions, such as pci_iomap_range(), can become managed that way, which is not the case. Add comments to the relevant functions' docstrings that warn users about this behavior. Link: https://lore.kernel.org/r/20240613115032.29098-7-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-10	PCI: Add managed pcim_request_region()	Philipp Stanner
	These existing functions: pci_request_region() pci_request_selected_regions() pci_request_selected_regions_exclusive() are "hybrid" functions built on __pci_request_region() and are managed if pcim_enable_device() has been called, but unmanaged otherwise. Add these new functions: pcim_request_region() pcim_request_region_exclusive() These are always managed and use the new pcim_addr_devres tracking infrastructure instead of find_pci_dr() and struct pci_devres.region_mask. Implement the hybrid functions using the new "pure" functions and remove struct pci_devres.region_mask, which is no longer needed. Link: https://lore.kernel.org/r/20240613115032.29098-6-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-10	PCI: Deprecate pcim_iomap_table(), pcim_iomap_regions_request_all()	Philipp Stanner
	Deprecate pcim_iomap_table(). It returns a pointer to a table of ioremapped BARs, or NULL if it fails. This makes uses like this: addr = pcim_iomap_table(pdev)[0]; problematic because it causes a NULL pointer dereference on failure. Callers should use pcim_iomap() instead. Deprecate pcim_iomap_regions_request_all() because it is built on __pci_request_region() and is managed if pcim_enable_device() has been called, but unmanaged otherwise, which is prone to errors. Callers should either use pcim_iomap_regions() to request and map BARs, or use pcim_request_region() followed by pcim_iomap(). Link: https://lore.kernel.org/r/20240613115032.29098-5-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log, sphinx markup] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-10	PCI: Add managed partial-BAR request and map infrastructure	Philipp Stanner
	The pcim_iomap_devres table tracks entire-BAR mappings, so we can't use it to build a managed version of pci_iomap_range(), which maps partial BARs. Add struct pcim_addr_devres, which can track request and mapping of both entire BARs and partial BARs. Add the following internal devres functions based on struct pcim_addr_devres: pcim_iomap_region() # request & map entire BAR pcim_iounmap_region() # unmap & release entire BAR pcim_request_region() # request entire BAR pcim_release_region() # release entire BAR pcim_request_all_regions() # request all entire BARs pcim_release_all_regions() # release all entire BARs Rework the following public interfaces using the new infrastructure listed above: pcim_iomap() # map partial BAR pcim_iounmap() # unmap partial BAR pcim_iomap_regions() # request & map specified BARs pcim_iomap_regions_request_all() # request all BARs, map specified BARs pcim_iounmap_regions() # unmap & release specified BARs Link: https://lore.kernel.org/r/20240613115032.29098-4-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-10	PCI: Add devres helpers for iomap table	Philipp Stanner
	The pcim_iomap_devres.table administrated by pcim_iomap_table() has its entries set and unset at several places throughout devres.c using manual iterations which are effectively code duplications. Add pcim_add_mapping_to_legacy_table() and pcim_remove_mapping_from_legacy_table() helper functions and use them where possible. Link: https://lore.kernel.org/r/20240613115032.29098-3-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-10	PCI: Add and use devres helper for bit masks	Philipp Stanner
	The current devres implementation uses manual shift operations to check whether a bit in a mask is set. The code can be made more readable by writing a small helper function for that. Implement mask_contains_bar() and use it where applicable. Link: https://lore.kernel.org/r/20240613115032.29098-2-pstanner@redhat.com Signed-off-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-10	net: phy: aquantia: add support for aqr115c	Bartosz Golaszewski
	Add support for a new model to the Aquantia driver. This PHY supports 2.5 gigabit speeds. The PHY mode is referred to by the manufacturer as Overclocked SGMII (OCSGMII) but this actually is just 2500BASEX without in-band signalling so reuse the existing mode to avoid changing the uAPI. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-10	net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values	Bartosz Golaszewski
	When the PHY is first coming up (or resuming from suspend), it's possible that although the FW status shows as running, we still see zeroes in the GLOBAL_CFG set of registers and cannot determine available modes. Since all models support 10M, add a poll and wait the config to become available. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-10	net: phy: aquantia: wait for FW reset before checking the vendor ID	Bartosz Golaszewski
	Checking the firmware register before it complete the boot process makes no sense, it will report 0 even if FW is available from internal memory. Always wait for FW to boot before continuing or we'll unnecessarily try to load it from nvmem/filesystem and fail. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-10	net: phy: aquantia: rename and export aqr107_wait_reset_complete()	Bartosz Golaszewski
	This function is quite generic in this driver and not limited to aqr107. We will use it outside its current compilation unit soon so rename it and declare it in the header. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-09	bpf: relax zero fixed offset constraint on KF_TRUSTED_ARGS/KF_RCU	Matt Bobrowski
	Currently, BPF kfuncs which accept trusted pointer arguments i.e. those flagged as KF_TRUSTED_ARGS, KF_RCU, or KF_RELEASE, all require an original/unmodified trusted pointer argument to be supplied to them. By original/unmodified, it means that the backing register holding the trusted pointer argument that is to be supplied to the BPF kfunc must have its fixed offset set to zero, or else the BPF verifier will outright reject the BPF program load. However, this zero fixed offset constraint that is currently enforced by the BPF verifier onto BPF kfuncs specifically flagged to accept KF_TRUSTED_ARGS or KF_RCU trusted pointer arguments is rather unnecessary, and can limit their usability in practice. Specifically, it completely eliminates the possibility of constructing a derived trusted pointer from an original trusted pointer. To put it simply, a derived pointer is a pointer which points to one of the nested member fields of the object being pointed to by the original trusted pointer. This patch relaxes the zero fixed offset constraint that is enforced upon BPF kfuncs which specifically accept KF_TRUSTED_ARGS, or KF_RCU arguments. Although, the zero fixed offset constraint technically also applies to BPF kfuncs accepting KF_RELEASE arguments, relaxing this constraint for such BPF kfuncs has subtle and unwanted side-effects. This was discovered by experimenting a little further with an initial version of this patch series [0]. The primary issue with relaxing the zero fixed offset constraint on BPF kfuncs accepting KF_RELEASE arguments is that it'd would open up the opportunity for BPF programs to supply both trusted pointers and derived trusted pointers to them. For KF_RELEASE BPF kfuncs specifically, this could be problematic as resources associated with the backing pointer could be released by the backing BPF kfunc and cause instabilities for the rest of the kernel. With this new fixed offset semantic in-place for BPF kfuncs accepting KF_TRUSTED_ARGS and KF_RCU arguments, we now have more flexibility when it comes to the BPF kfuncs that we're able to introduce moving forward. Early discussions covering the possibility of relaxing the zero fixed offset constraint can be found using the link below. This will provide more context on where all this has stemmed from [1]. Notably, pre-existing tests have been updated such that they provide coverage for the updated zero fixed offset functionality. Specifically, the nested offset test was converted from a negative to positive test as it was already designed to assert zero fixed offset semantics of a KF_TRUSTED_ARGS BPF kfunc. [0] https://lore.kernel.org/bpf/ZnA9ndnXKtHOuYMe@google.com/ [1] https://lore.kernel.org/bpf/ZhkbrM55MKQ0KeIV@google.com/ Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20240709210939.1544011-1-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-07-09	Merge branch 'fix-libbpf-bpf-skeleton-forward-backward-compat'	Alexei Starovoitov
	Andrii Nakryiko says: ==================== Fix libbpf BPF skeleton forward/backward compat Fix recently identified (but long standing) bug with handling BPF skeleton forward and backward compatibility. On libbpf side, even though BPF skeleton was always designed to be forward and backwards compatible through recording actual size of constrituents of BPF skeleton itself (map/prog/var skeleton definitions), libbpf implementation did implicitly hard-code those sizes by virtue of using a trivial array access syntax. This issue will only affect libbpf used as a shared library. Statically compiled libbpfs will always be in sync with BPF skeleton, bypassing this problem altogether. This patch set fixes libbpf, but also mitigates the problem for old libbpf versions by teaching bpftool to generate more conservative BPF skeleton, if possible (i.e., if there are no struct_ops maps defined). v1->v2: - fix SOB, add acks, typo fixes (Quentin, Eduard); - improve reporting of skipped map auto-attachment (Alan, Eduard). ==================== Link: https://lore.kernel.org/r/20240708204540.4188946-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-07-09	libbpf: improve old BPF skeleton handling for map auto-attach	Andrii Nakryiko
	Improve how we handle old BPF skeletons when it comes to BPF map auto-attachment. Emit one warn-level message per each struct_ops map that could have been auto-attached, if user provided recent enough BPF skeleton version. Don't spam log if there are no relevant struct_ops maps, though. This should help users realize that they probably need to regenerate BPF skeleton header with more recent bpftool/libbpf-cargo (or whatever other means of BPF skeleton generation). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240708204540.4188946-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-07-09	libbpf: fix BPF skeleton forward/backward compat handling	Andrii Nakryiko
	BPF skeleton was designed from day one to be extensible. Generated BPF skeleton code specifies actual sizes of map/prog/variable skeletons for that reason and libbpf is supposed to work with newer/older versions correctly. Unfortunately, it was missed that we implicitly embed hard-coded most up-to-date (according to libbpf's version of libbpf.h header used to compile BPF skeleton header) sizes of those structs, which can differ from the actual sizes at runtime when libbpf is used as a shared library. We have a few places were we just index array of maps/progs/vars, which implicitly uses these potentially invalid sizes of structs. This patch aims to fix this problem going forward. Once this lands, we'll backport these changes in Github repo to create patched releases for older libbpfs. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Fixes: d66562fba1ce ("libbpf: Add BPF object skeleton support") Fixes: 430025e5dca5 ("libbpf: Add subskeleton scaffolding") Fixes: 08ac454e258e ("libbpf: Auto-attach struct_ops BPF maps in BPF skeleton") Co-developed-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240708204540.4188946-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-07-09	bpftool: improve skeleton backwards compat with old buggy libbpfs	Andrii Nakryiko
	Old versions of libbpf don't handle varying sizes of bpf_map_skeleton struct correctly. As such, BPF skeleton generated by newest bpftool might not be compatible with older libbpf (though only when libbpf is used as a shared library), even though it, by design, should. Going forward libbpf will be fixed, plus we'll release bug fixed versions of relevant old libbpfs, but meanwhile try to mitigate from bpftool side by conservatively assuming older and smaller definition of bpf_map_skeleton, if possible. Meaning, if there are no struct_ops maps. If there are struct_ops, then presumably user would like to have auto-attaching logic and struct_ops map link placeholders, so use the full bpf_map_skeleton definition in that case. Acked-by: Quentin Monnet <qmo@kernel.org> Co-developed-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240708204540.4188946-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-07-09	net: ethernet: lantiq_etop: fix double free in detach	Aleksander Jan Bajkowski
	The number of the currently released descriptor is never incremented which results in the same skb being released multiple times. Fixes: 504d4721ee8e ("MIPS: Lantiq: Add ethernet driver") Reported-by: Joe Perches <joe@perches.com> Closes: https://lore.kernel.org/all/fc1bf93d92bb5b2f99c6c62745507cc22f3a7b2d.camel@perches.com/ Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20240708205826.5176-1-olek2@wp.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	netxen_nic: Use {low,upp}er_32_bits() helpers	Geert Uytterhoeven
	Use the existing {low,upp}er_32_bits() helpers instead of defining custom variants. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/319d4a5313ac75f7bbbb6b230b6802b18075c3e0.1720430602.git.geert+renesas@glider.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	Merge branch 'mlx5-misc-patches-2023-07-08'	Jakub Kicinski
	Tariq Toukan says: ==================== mlx5 misc patches 2023-07-08 This patchset contains features and small enhancements from the team to the mlx5 core and Eth drivers. ==================== Link: https://patch.msgid.link/20240708080025.1593555-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	net/mlx5e: CT: Initialize err to 0 to avoid warning	Cosmin Ratiu
	It is theoretically possible to return bogus uninitialized values from mlx5_tc_ct_entry_replace_rules, even though in practice this will never be the case as the flow rule will be part of at least the regular ct table or the ct nat table, if not both. But to reduce noise, initialize err to 0. Fixes: 49d37d05f216 ("net/mlx5: CT: Separate CT and CT-NAT tuple entries") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20240708080025.1593555-11-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	net/mlx5e: SHAMPO, Add missing aggregate counter	Dragos Tatulea
	When the rx_hds_nodata_packets/bytes counters were added, the aggregate counters were omitted. This patch adds them. Fixes: e95c5b9e8912 ("net/mlx5e: SHAMPO, Add header-only ethtool counters for header data split") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20240708080025.1593555-10-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	net/mlx5: DR, Remove definer functions from SW Steering API	Yevgeny Kliteynik
	No need to expose definer get/put functions as part of SW Steering API - they are internal functions. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20240708080025.1593555-9-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	Merge branch 'mlxsw-improvements'	Jakub Kicinski
	Petr Machata says: ==================== mlxsw: Improvements This patchset contains assortments of improvements to the mlxsw driver. Please see individual patches for details. ==================== Link: https://patch.msgid.link/cover.1720447210.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	mlxsw: pci: Lock configuration space of upstream bridge during reset	Ido Schimmel
	The driver triggers a "Secondary Bus Reset" (SBR) by calling __pci_reset_function_locked() which asserts the SBR bit in the "Bridge Control Register" in the configuration space of the upstream bridge for 2ms. This is done without locking the configuration space of the upstream bridge port, allowing user space to access it concurrently. Linux 6.11 will start warning about such unlocked resets [1][2]: pcieport 0000:00:01.0: unlocked secondary bus reset via: pci_reset_bus_function+0x51c/0x6a0 Avoid the warning and the concurrent access by locking the configuration space of the upstream bridge prior to the reset and unlocking it afterwards. [1] https://lore.kernel.org/all/171711746953.1628941.4692125082286867825.stgit@dwillia2-xfh.jf.intel.com/ [2] https://lore.kernel.org/all/20240531213150.GA610983@bhelgaas/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/9937b0afdb50f2f2825945393c94c093c04a5897.1720447210.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	mlxsw: core_thermal: Report valid current state during cooling device ↵	Ido Schimmel
	registration Commit 31a0fa0019b0 ("thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()") changed the thermal core to read the current state of the cooling device as part of the cooling device's registration. This is incompatible with the current implementation of the cooling device operations in mlxsw, leading to initialization failure with errors such as: mlxsw_spectrum 0000:01:00.0: Failed to register cooling device mlxsw_spectrum 0000:01:00.0: cannot register bus device The reason for the failure is that when the get current state operation is invoked the driver tries to derive the index of the cooling device by walking a per thermal zone array and looking for the matching cooling device pointer. However, the pointer is returned from the registration function and therefore only set in the array after the registration. The issue was later fixed by commit 1af89dedc8a5 ("thermal: core: Do not fail cdev registration because of invalid initial state") by not failing the registration of the cooling device if it cannot report a valid current state during registration, although drivers are responsible for ensuring that this will not happen. Therefore, make sure the driver is able to report a valid current state for the cooling device during registration by passing to the registration function a per cooling device private data that already has the cooling device index populated. While at it, call thermal_cooling_device_unregister() unconditionally since the function returns immediately if the cooling device pointer is NULL. Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/c823c4678b6b7afb902c35b3551c81a053afd110.1720447210.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	mlxsw: Warn about invalid accesses to array fields	Petr Machata
	A forgotten or buggy variable initialization can cause out-of-bounds access to a register or other item array field. For an overflow, such access would mangle adjacent parts of the register payload. For an underflow, due to all variables being unsigned, the access would likely trample unrelated memory. Since neither is correct, replace these accesses with accesses at the index of 0, and warn about the issue. Suggested-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/b988fb265c2f6c1206fe12d5bfdcfa188b7672d1.1720447210.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	i40e: Fix XDP program unloading while removing the driver	Michal Kubiak
	The commit 6533e558c650 ("i40e: Fix reset path while removing the driver") introduced a new PF state "__I40E_IN_REMOVE" to block modifying the XDP program while the driver is being removed. Unfortunately, such a change is useful only if the ".ndo_bpf()" callback was called out of the rmmod context because unloading the existing XDP program is also a part of driver removing procedure. In other words, from the rmmod context the driver is expected to unload the XDP program without reporting any errors. Otherwise, the kernel warning with callstack is printed out to dmesg. Example failing scenario: 1. Load the i40e driver. 2. Load the XDP program. 3. Unload the i40e driver (using "rmmod" command). The example kernel warning log: [ +0.004646] WARNING: CPU: 94 PID: 10395 at net/core/dev.c:9290 unregister_netdevice_many_notify+0x7a9/0x870 [...] [ +0.010959] RIP: 0010:unregister_netdevice_many_notify+0x7a9/0x870 [...] [ +0.002726] Call Trace: [ +0.002457] <TASK> [ +0.002119] ? __warn+0x80/0x120 [ +0.003245] ? unregister_netdevice_many_notify+0x7a9/0x870 [ +0.005586] ? report_bug+0x164/0x190 [ +0.003678] ? handle_bug+0x3c/0x80 [ +0.003503] ? exc_invalid_op+0x17/0x70 [ +0.003846] ? asm_exc_invalid_op+0x1a/0x20 [ +0.004200] ? unregister_netdevice_many_notify+0x7a9/0x870 [ +0.005579] ? unregister_netdevice_many_notify+0x3cc/0x870 [ +0.005586] unregister_netdevice_queue+0xf7/0x140 [ +0.004806] unregister_netdev+0x1c/0x30 [ +0.003933] i40e_vsi_release+0x87/0x2f0 [i40e] [ +0.004604] i40e_remove+0x1a1/0x420 [i40e] [ +0.004220] pci_device_remove+0x3f/0xb0 [ +0.003943] device_release_driver_internal+0x19f/0x200 [ +0.005243] driver_detach+0x48/0x90 [ +0.003586] bus_remove_driver+0x6d/0xf0 [ +0.003939] pci_unregister_driver+0x2e/0xb0 [ +0.004278] i40e_exit_module+0x10/0x5f0 [i40e] [ +0.004570] __do_sys_delete_module.isra.0+0x197/0x310 [ +0.005153] do_syscall_64+0x85/0x170 [ +0.003684] ? syscall_exit_to_user_mode+0x69/0x220 [ +0.004886] ? do_syscall_64+0x95/0x170 [ +0.003851] ? exc_page_fault+0x7e/0x180 [ +0.003932] entry_SYSCALL_64_after_hwframe+0x71/0x79 [ +0.005064] RIP: 0033:0x7f59dc9347cb [ +0.003648] Code: 73 01 c3 48 8b 0d 65 16 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 35 16 0c 00 f7 d8 64 89 01 48 [ +0.018753] RSP: 002b:00007ffffac99048 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 [ +0.007577] RAX: ffffffffffffffda RBX: 0000559b9bb2f6e0 RCX: 00007f59dc9347cb [ +0.007140] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000559b9bb2f748 [ +0.007146] RBP: 00007ffffac99070 R08: 1999999999999999 R09: 0000000000000000 [ +0.007133] R10: 00007f59dc9a5ac0 R11: 0000000000000206 R12: 0000000000000000 [ +0.007141] R13: 00007ffffac992d8 R14: 0000559b9bb2f6e0 R15: 0000000000000000 [ +0.007151] </TASK> [ +0.002204] ---[ end trace 0000000000000000 ]--- Fix this by checking if the XDP program is being loaded or unloaded. Then, block only loading a new program while "__I40E_IN_REMOVE" is set. Also, move testing "__I40E_IN_REMOVE" flag to the beginning of XDP_SETUP callback to avoid unnecessary operations and checks. Fixes: 6533e558c650 ("i40e: Fix reset path while removing the driver") Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20240708230750.625986-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-10	tracing/kprobes: Fix build error when find_module() is not available	Masami Hiramatsu (Google)
	The kernel test robot reported that the find_module() is not available if CONFIG_MODULES=n. Fix this error by hiding find_modules() in #ifdef CONFIG_MODULES with related rcu locks as try_module_get_by_name(). Link: https://lore.kernel.org/all/172056819167.201571.250053007194508038.stgit@devnote2/ Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202407070744.RcLkn8sq-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202407070917.VVUCBlaS-lkp@intel.com/ Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-07-09	Merge branch 'selftests-drv-net-rss_ctx-more-tests'	Jakub Kicinski
	Jakub Kicinski says: ==================== selftests: drv-net: rss_ctx: more tests Add a few more tests for RSS. v1: https://lore.kernel.org/all/20240705015725.680275-1-kuba@kernel.org/ ==================== Link: https://patch.msgid.link/20240708213627.226025-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	selftests: drv-net: rss_ctx: test flow rehashing without impacting traffic	Jakub Kicinski
	Some workloads may want to rehash the flows in response to an imbalance. Most effective way to do that is changing the RSS key. Check that changing the key does not cause link flaps or traffic disruption. Disrupting traffic for key update is not incorrect, but makes the key update unusable for rehashing under load. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240708213627.226025-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	selftests: drv-net: rss_ctx: check behavior of indirection table resizing	Jakub Kicinski
	Some devices dynamically increase and decrease the size of the RSS indirection table based on the number of enabled queues. When that happens driver must maintain the balance of entries (preferably duplicating the smaller table). Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240708213627.226025-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	selftests: drv-net: rss_ctx: test queue changes vs user RSS config	Jakub Kicinski
	By default main RSS table should change to include all queues. When user sets a specific RSS config the driver should preserve it, even when queue count changes. Driver should refuse to deactivate queues used in the user-set RSS config. For additional contexts driver should still refuse to deactivate queues in use. Whether the contexts should get resized like context 0 when queue count increases is a bit unclear. I anticipate most drivers today don't do that. Since main use case for additional contexts is to set the indir table - it doesn't seem worthwhile to care about behavior of the default table too much. Don't test that. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240708213627.226025-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	selftests: drv-net: rss_ctx: factor out send traffic and check	Jakub Kicinski
	Wrap up sending traffic and checking in which queues it landed in a helper. The method used for testing is to send a lot of iperf traffic and check which queues received the most packets. Those should be the queues where we expect iperf to land - either because we installed a filter for the port iperf uses, or we didn't and expect it to use context 0. Contexts get disjoint queue sets, but the main context (AKA context 0) may receive some background traffic (noise). Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240708213627.226025-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	selftests: drv-net: rss_ctx: fix cleanup in the basic test	Jakub Kicinski
	The basic test may fail without resetting the RSS indir table. Use the .exec() method to run cleanup early since we re-test with traffic that returning to default state works. While at it reformat the doc a tiny bit. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240708213627.226025-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-09	PCI: dw-rockchip: Use pci_epc_init_notify() directly	Niklas Cassel
	A previous commit ("PCI: dwc: ep: Remove dw_pcie_ep_init_notify() wrapper") removed the dw_pcie_ep_init_notify() wrapper and changed the DWC glue drivers to instead use pci_epc_init_notify() directly. The endpoint support for the dw-rockchip had not been merged at that point in time, so the previous commit wrapper") did not update dw-rockchip. Do the same change for dw-rockchip, so that the driver will not try to use a function that has now been removed. Link: https://lore.kernel.org/linux-pci/20240622132024.2927799-2-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-09	PCI: dw-rockchip: Add endpoint mode support	Niklas Cassel
	The PCIe controller in rk3568 and rk3588 can operate in endpoint mode. This endpoint mode support heavily leverages the existing code in pcie-designware-ep.c. Add support for endpoint mode to the existing pcie-dw-rockchip glue driver. [kwilczynski: squash with patch adding the PCI_ENDPOINT dependency] Link: https://lore.kernel.org/linux-pci/20240607-rockchip-pcie-ep-v1-v5-10-0a042d6b0049@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-07-09	PCI: dw-rockchip: Refactor the driver to prepare for EP mode	Niklas Cassel
	Refactor the driver to prepare for EP mode. Add of-match data to the existing compatible, and explicitly define it as DW_PCIE_RC_TYPE. This way, we will be able to add EP mode in a follow-up commit in a much less intrusive way, which makes the follow-up commit much easier to review. No functional change intended. Link: https://lore.kernel.org/linux-pci/20240607-rockchip-pcie-ep-v1-v5-9-0a042d6b0049@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-07-09	PCI: dw-rockchip: Add rockchip_pcie_get_ltssm() helper	Niklas Cassel
	Add a rockchip_pcie_ltssm() helper function that reads the LTSSM status. This helper will be used in additional places in follow-up commits. Link: https://lore.kernel.org/linux-pci/20240607-rockchip-pcie-ep-v1-v5-8-0a042d6b0049@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>