linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2022-01-08	kbuild: do not quote string values in include/config/auto.conf	Masahiro Yamada
	The previous commit fixed up all shell scripts to not include include/config/auto.conf. Now that include/config/auto.conf is only included by Makefiles, we can change it into a more Make-friendly form. Previously, Kconfig output string values enclosed with double-quotes (both in the .config and include/config/auto.conf): CONFIG_X="foo bar" Unlike shell, Make handles double-quotes (and single-quotes as well) verbatim. We must rip them off when used. There are some patterns: [1] $(patsubst "%",%,$(CONFIG_X)) [2] $(CONFIG_X:"%"=%) [3] $(subst ",,$(CONFIG_X)) [4] $(shell echo $(CONFIG_X)) These are not only ugly, but also fragile. [1] and [2] do not work if the value contains spaces, like CONFIG_X=" foo bar " [3] does not work correctly if the value contains double-quotes like CONFIG_X="foo\"bar" [4] seems to work better, but has a cost of forking a process. Anyway, quoted strings were always PITA for our Makefiles. This commit changes Kconfig to stop quoting in include/config/auto.conf. These are the string type symbols referenced in Makefiles or scripts: ACPI_CUSTOM_DSDT_FILE ARC_BUILTIN_DTB_NAME ARC_TUNE_MCPU BUILTIN_DTB_SOURCE CC_IMPLICIT_FALLTHROUGH CC_VERSION_TEXT CFG80211_EXTRA_REGDB_KEYDIR EXTRA_FIRMWARE EXTRA_FIRMWARE_DIR EXTRA_TARGETS H8300_BUILTIN_DTB INITRAMFS_SOURCE LOCALVERSION MODULE_SIG_HASH MODULE_SIG_KEY NDS32_BUILTIN_DTB NIOS2_DTB_SOURCE OPENRISC_BUILTIN_DTB SOC_CANAAN_K210_DTB_SOURCE SYSTEM_BLACKLIST_HASH_LIST SYSTEM_REVOCATION_KEYS SYSTEM_TRUSTED_KEYS TARGET_CPU UNUSED_KSYMS_WHITELIST XILINX_MICROBLAZE0_FAMILY XILINX_MICROBLAZE0_HW_VER XTENSA_VARIANT_NAME I checked them one by one, and fixed up the code where necessary. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-01-08	kbuild: do not include include/config/auto.conf from shell scripts	Masahiro Yamada
	Richard Weinberger pointed out the risk of sourcing the kernel config from shell scripts [1], and proposed some patches [2], [3]. It is a good point, but it took a long time because I was wondering how to fix this. This commit goes with simple grep approach because there are only a few scripts including the kernel configuration. scripts/link_vmlinux.sh has references to a bunch of CONFIG options, all of which are boolean. I added is_enabled() helper as scripts/package/{mkdebian,builddeb} do. scripts/gen_autoksyms.sh uses 'eval', stating "to expand the whitelist path". I removed it since it is the issue we are trying to fix. I was a bit worried about the cost of invoking the grep command over again. I extracted the grep parts from it, and measured the cost. It was approximately 0.03 sec, which I hope is acceptable. [test code] $ cat test-grep.sh #!/bin/sh is_enabled() { grep -q "^$1=y" include/config/auto.conf } is_enabled CONFIG_LTO_CLANG is_enabled CONFIG_LTO_CLANG is_enabled CONFIG_STACK_VALIDATION is_enabled CONFIG_UNWINDER_ORC is_enabled CONFIG_FTRACE_MCOUNT_USE_OBJTOOL is_enabled CONFIG_VMLINUX_VALIDATION is_enabled CONFIG_FRAME_POINTER is_enabled CONFIG_GCOV_KERNEL is_enabled CONFIG_LTO_CLANG is_enabled CONFIG_RETPOLINE is_enabled CONFIG_X86_SMAP is_enabled CONFIG_LTO_CLANG is_enabled CONFIG_VMLINUX_MAP is_enabled CONFIG_KALLSYMS_ALL is_enabled CONFIG_KALLSYMS_ABSOLUTE_PERCPU is_enabled CONFIG_KALLSYMS_BASE_RELATIVE is_enabled CONFIG_DEBUG_INFO_BTF is_enabled CONFIG_KALLSYMS is_enabled CONFIG_DEBUG_INFO_BTF is_enabled CONFIG_BPF is_enabled CONFIG_BUILDTIME_TABLE_SORT is_enabled CONFIG_KALLSYMS $ time ./test-grep.sh real 0m0.036s user 0m0.027s sys m0.009s [1]: https://lore.kernel.org/all/1919455.eZKeABUfgV@blindfold/ [2]: https://lore.kernel.org/all/20180219092245.26404-1-richard@nod.at/ [3]: https://lore.kernel.org/all/20210920213957.1064-2-richard@nod.at/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <n.schier@avm.de>
2022-01-08	certs: simplify $(srctree)/ handling and remove config_filename macro	Masahiro Yamada
	The complex macro, config_filename, was introduced to do: [1] drop double-quotes from the string value [2] add $(srctree)/ prefix in case the file is not found in $(objtree) [3] escape spaces and more [1] will be more generally handled by Kconfig later. As for [2], Kbuild uses VPATH to search for files in $(objtree), $(srctree) in this order. GNU Make can natively handle it. As for [3], converting $(space) to $(space_escape) back and forth looks questionable to me. It is well-known that GNU Make cannot handle file paths with spaces in the first place. Instead of using the complex macro, use $< so it will be expanded to the file path of the key. Remove config_filename, finally. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-01-08	kbuild: stop using config_filename in scripts/Makefile.modsign	Masahiro Yamada
	Toward the goal of removing the config_filename macro, drop the double-quotes and add $(srctree)/ prefix in an ad hoc way. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <n.schier@avm.de>
2022-01-08	certs: remove misleading comments about GCC PR	Masahiro Yamada
	This dependency is necessary irrespective of the mentioned GCC PR because the embedded certificates are build artifacts and must be generated by extract_certs before *.S files are compiled. The comment sounds like we are hoping to remove these dependencies someday. No, we cannot remove them. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-01-08	certs: refactor file cleaning	Masahiro Yamada
	'make clean' removes files listed in 'targets'. It is redundant to specify both 'targets' and 'clean-files'. Move 'targets' assignments out of the ifeq-conditionals so scripts/Makefile.clean can see them. One effective change is that certs/certs/signing_key.x509 is now deleted by 'make clean' instead of 'make mrproper. This certificate is embedded in the kernel. It is not used in any way by external module builds. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <n.schier@avm.de>
2022-01-08	certs: remove unneeded -I$(srctree) option for system_certificates.o	Masahiro Yamada
	The .incbin directive in certs/system_certificates.S includes certs/signing_key.x509 and certs/x509_certificate_list, both of which are generated by extract_certs, i.e. exist in $(objtree). This option -I$(srctree) is unneeded. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-01-08	certs: unify duplicated cmd_extract_certs and improve the log	Masahiro Yamada
	cmd_extract_certs is defined twice. Unify them. The current log shows the input file $(2), which might be empty. You cannot know what is being created from the log, "EXTRACT_CERTS". Change the log to show the output file with better alignment. [Before] EXTRACT_CERTS certs/signing_key.pem CC certs/system_keyring.o EXTRACT_CERTS AS certs/system_certificates.o CC certs/common.o CC certs/blacklist.o EXTRACT_CERTS AS certs/revocation_certificates.o [After] CERT certs/signing_key.x509 CC certs/system_keyring.o CERT certs/x509_certificate_list AS certs/system_certificates.o CC certs/common.o CC certs/blacklist.o CERT certs/x509_revocation_list AS certs/revocation_certificates.o Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <n.schier@avm.de>
2022-01-08	certs: use $< and $@ to simplify the key generation rule	Masahiro Yamada
	Do not repeat $(obj)/x509.genkey or $(obj)/signing_key.pem Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <n.schier@avm.de>
2022-01-08	kbuild: remove headers_check stub	Masahiro Yamada
	Linux 5.15 is out. Remove this stub now. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-01-08	kbuild: move headers_check.pl to usr/include/	Masahiro Yamada
	This script is only used by usr/include/Makefile. Make it local to the directory. Update the comment in include/uapi/linux/soundcard.h because 'make headers_check' is no longer functional. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-01-08	ALSA: hda: Fix dependency on ASoC cs35l41 codec	Takashi Iwai
	The recently added support for CS35L41 codec unconditionally selects CONFIG_SND_SOC_CS35L41_LIB, but this can't work unless the top-level CONFIG_SND_SOC is enabled. This patch adds the proper dependency. Fixes: 7b2f3eb492da ("ALSA: hda: cs35l41: Add support for CS35L41 in HDA systems") Link: https://lore.kernel.org/r/20220107092647.20258-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-01-08	mm: Use multi-index entries in the page cache	Matthew Wilcox (Oracle)
	We currently store large folios as 2^N consecutive entries. While this consumes rather more memory than necessary, it also turns out to be buggy. A writeback operation which starts within a tail page of a dirty folio will not write back the folio as the xarray's dirty bit is only set on the head index. With multi-index entries, the dirty bit will be found no matter where in the folio the operation starts. This does end up simplifying the page cache slightly, although not as much as I had hoped. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	XArray: Add xas_advance()	Matthew Wilcox (Oracle)
	Add a new helper function to help iterate over multi-index entries. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	truncate,shmem: Handle truncates that split large folios	Matthew Wilcox (Oracle)
	Handle folio splitting in the parts of the truncation functions which already handle partial pages. Factor all that code out into a new function called truncate_inode_partial_folio(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	truncate: Convert invalidate_inode_pages2_range to folios	Matthew Wilcox (Oracle)
	If we're going to unmap a folio, we have to be sure to unmap the entire folio, not just the part of it which lies after the search index. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	fs: Convert vfs_dedupe_file_range_compare to folios	Matthew Wilcox (Oracle)
	We still only operate on a single page of data at a time due to using kmap(). A more complex implementation would work on each page in a folio, but it's not clear that such a complex implementation would be worthwhile. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	mm: Remove pagevec_remove_exceptionals()	Matthew Wilcox (Oracle)
	All of its callers now call folio_batch_remove_exceptionals(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	mm: Convert find_lock_entries() to use a folio_batch	Matthew Wilcox (Oracle)
	find_lock_entries() already only returned the head page of folios, so convert it to return a folio_batch instead of a pagevec. That cascades through converting truncate_inode_pages_range() to delete_from_page_cache_batch() and page_cache_delete_batch(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	filemap: Return only folios from find_get_entries()	Matthew Wilcox (Oracle)
	The callers have all been converted to work on folios, so convert find_get_entries() to return a batch of folios instead of pages. We also now return multiple large folios in a single call. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2022-01-08	filemap: Convert filemap_get_read_batch() to use a folio_batch	Matthew Wilcox (Oracle)
	This change ripples all the way through the filemap_read() call chain and removes a lot of messing about converting folios to pages and back again. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	filemap: Convert filemap_read() to use a folio	Matthew Wilcox (Oracle)
	We know the pagevec always contains folios, but use page_folio() anyway instead of casting. Removes a few calls to legacy functions. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	truncate: Add invalidate_complete_folio2()	Matthew Wilcox (Oracle)
	Convert invalidate_complete_page2() to invalidate_complete_folio2(). Use filemap_free_folio() to free the page instead of calling ->freepage manually. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	truncate: Convert invalidate_inode_pages2_range() to use a folio	Matthew Wilcox (Oracle)
	If we're going to unmap a folio, we have to be sure to unmap the entire folio, not just the part of it which lies after the search index. We cannot yet remove the struct page from invalidate_inode_pages2_range() because the page pointer in the pvec might be a shadow/dax/swap entry instead of actually a page. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	truncate: Skip known-truncated indices	Matthew Wilcox (Oracle)
	If we've truncated an entire folio, we can skip over all the indices covered by this folio. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	truncate,shmem: Add truncate_inode_folio()	Matthew Wilcox (Oracle)
	Convert all callers of truncate_inode_page() to call truncate_inode_folio() instead, and move the declaration to mm/internal.h. Move the assertion that the caller is not passing in a tail page to generic_error_remove_page(). We can't entirely remove the struct page from the callers yet because the page pointer in the pvec might be a shadow/dax/swap entry instead of actually a page. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	shmem: Convert part of shmem_undo_range() to use a folio	Matthew Wilcox (Oracle)
	find_lock_entries() never returns tail pages. We cannot use page_folio() here as the pagevec may also contain swap entries, so simply cast for now. This is an intermediate step which will be fully removed by the end of this series. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-08	mm: Add unmap_mapping_folio()	Matthew Wilcox (Oracle)
	Convert both callers of unmap_mapping_page() to call unmap_mapping_folio() instead. Also move zap_details from linux/mm.h to mm/memory.c Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: William Kucharski <william.kucharski@oracle.com>
2022-01-07	Merge branch 'ena-capabilities-field-and-cosmetic-changes'	Jakub Kicinski
	Arthur Kiyanovski says: ==================== ENA: capabilities field and cosmetic changes Add a new capabilities bitmask field to get indication of capabilities supported by the device. Use the capabilities field to query the device for ENI stats support. Other patches are cosmetic changes like fixing readme mistakes, removing unused variables etc... ==================== Link: https://lore.kernel.org/r/20220107202346.3522-1-akiyano@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	net: ena: Extract recurring driver reset code into a function	Arthur Kiyanovski
	Create an inline function for resetting the driver to reduce code duplication. Signed-off-by: Nati Koler <nkoler@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	net: ena: Change the name of bad_csum variable	Arthur Kiyanovski
	Changed bad_csum to csum_bad to align with csum_unchecked & csum_good Signed-off-by: Nati Koler <nkoler@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	net: ena: Add debug prints for invalid req_id resets	Arthur Kiyanovski
	Add qid and req_id to error prints when ENA_REGS_RESET_INV_TX_REQ_ID reset occurs. Switch from %hu to %u, since u16 should be printed with %u, as explained in [1]. [1] - https://www.kernel.org/doc/html/latest/core-api/printk-formats.html Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	net: ena: Remove ena_calc_queue_size_ctx struct	Arthur Kiyanovski
	This struct was used to pass data from callee function to its caller. Its usage can be avoided. Removing it results in less code without any damage to code readability. Also it allows to consolidate ring size calculation into a single function (ena_calc_io_queue_size()). Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	net: ena: Move reset completion print to the reset function	Arthur Kiyanovski
	The print that indicates that device reset has finished is currently called from ena_restore_device(). Move it to ena_fw_reset_device() as it is the more natural location for it. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	net: ena: Remove redundant return code check	Arthur Kiyanovski
	The ena_com_indirect_table_fill_entry() function only returns -EINVAL or 0, no need to check for -EOPNOTSUPP. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	net: ena: Update LLQ header length in ena documentation	Arthur Kiyanovski
	LLQ entry length is 128 bytes. Therefore the maximum header in the entry is calculated by: tx_max_header_size = LLQ_ENTRY_SIZE - DESCRIPTORS_NUM_BEFORE_HEADER * 16 = 128 - 2 * 16 = 96 This patch updates the documentation so that it states the correct max header length. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	net: ena: Change ENI stats support check to use capabilities field	Arthur Kiyanovski
	Use the capabilities field to query the device for ENI stats support. This replaces the previous method that tried to get the ENI stats during ena_probe() and used the success or failure as an indication for support by the device. Remove eni_stats_supported field from struct ena_adapter. This field was used for the previous method of queriying for ENI stats support. Change the severity level of the print in case of ena_com_get_eni_stats() failure from info to error. With the previous method of querying form ENI stats support, failure to get ENI stats was normal for devices that don't support it. With the use of the capabilities field such a failure is unexpected, as it is called only if the device reported that it supports ENI stats. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	net: ena: Add capabilities field with support for ENI stats capability	Arthur Kiyanovski
	This bitmask field indicates what capabilities are supported by the device. The capabilities field differs from the 'supported_features' field which indicates what sub-commands for the set/get feature commands are supported. The sub-commands are specified in the 'feature_id' field of the 'ena_admin_set_feat_cmd' struct in the following way: struct ena_admin_set_feat_cmd cmd; cmd.aq_common_descriptor.opcode = ENA_ADMIN_SET_FEATURE; cmd.feat_common.feature_ The 'capabilities' field, on the other hand, specifies different capabilities of the device. For example, whether the device supports querying of ENI stats. Also add an enumerator which contains all the capabilities. The first added capability macro is for ENI stats feature. Capabilities are queried along with the other device attributes (in ena_com_get_dev_attr_feat()) during device initialization and are stored in the ena_com_dev struct. They can be later queried using the ena_com_get_cap() helper function. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	net: ena: Change return value of ena_calc_io_queue_size() to void	Arthur Kiyanovski
	ena_calc_io_queue_size() always returns 0, therefore make it a void function and update the calling function to stop checking the return value. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	af_packet: fix tracking issues in packet_do_bind()	Eric Dumazet
	It appears that my changes in packet_do_bind() were slightly wrong. syzbot found that calling bind() twice would trigger a false positive. Remove proto_curr/dev_curr variables and rewrite things to be less confusing (like not having to use netdev_tracker_alloc(), and instead use the standard dev_hold_track()) Fixes: f1d9268e0618 ("net: add net device refcount tracker to struct packet_type") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Link: https://lore.kernel.org/r/20220107183953.3886647-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	octeontx2-af: Fix interrupt name strings	Sunil Goutham
	Fixed interrupt name string logic which currently results in wrong memory location being accessed while dumping /proc/interrupts. Fixes: 4826090719d4 ("octeontx2-af: Enable CPT HW interrupts") Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/1641538505-28367-1-git-send-email-sbhatta@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	Merge branch 'mptcp-refactoring-for-one-selftest-and-csum-validation'	Jakub Kicinski
	Mat Martineau says: ==================== mptcp: Refactoring for one selftest and csum validation Patch 1 changes the MPTCP join self tests to depend more on events rather than delays, so the script runs faster and has more consistent results. Patches 2 and 3 get rid of some duplicate code in MPTCP's checksum validation by modifying and leveraging an existing helper function. ==================== Link: https://lore.kernel.org/r/20220107192524.445137-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	mptcp: reuse __mptcp_make_csum in validate_data_csum	Geliang Tang
	This patch reused __mptcp_make_csum() in validate_data_csum() instead of open-coding. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	mptcp: change the parameter of __mptcp_make_csum	Geliang Tang
	This patch changed the type of the last parameter of __mptcp_make_csum() from __sum16 to __wsum. And export this function in protocol.h. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	selftests: mptcp: more stable join tests-cases	Paolo Abeni
	MPTCP join self-tests are a bit fragile as they reply on delays instead of events to catch-up with the expected sockets states. Replace the delay with state checking where possible and reduce the number of sleeps in the most complex scenarios. This will both reduce the tests run-time and will improve stability. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	net: dsa: felix: add port fast age support	Vladimir Oltean
	Add support for flushing the MAC table on a given port in the ocelot switch library, and use this functionality in the felix DSA driver. This operation is needed when a port leaves a bridge to become standalone, and when the learning is disabled, and when the STP state changes to a state where no FDB entry should be present. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220107144229.244584-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	net: mscc: ocelot: fix incorrect balancing with down LAG ports	Vladimir Oltean
	Assuming the test setup described here: https://patchwork.kernel.org/project/netdevbpf/cover/20210205130240.4072854-1-vladimir.oltean@nxp.com/ (swp1 and swp2 are in bond0, and bond0 is in a bridge with swp0) it can be seen that when swp1 goes down (on either board A or B), then traffic that should go through that port isn't forwarded anywhere. A dump of the PGID table shows the following: PGID_DST[0] = ports 0 PGID_DST[1] = ports 1 PGID_DST[2] = ports 2 PGID_DST[3] = ports 3 PGID_DST[4] = ports 4 PGID_DST[5] = ports 5 PGID_DST[6] = no ports PGID_AGGR[0] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[1] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[2] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[3] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[4] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[5] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[6] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[7] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[8] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[9] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[10] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[11] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[12] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[13] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[14] = ports 0, 1, 2, 3, 4, 5 PGID_AGGR[15] = ports 0, 1, 2, 3, 4, 5 PGID_SRC[0] = ports 1, 2 PGID_SRC[1] = ports 0 PGID_SRC[2] = ports 0 PGID_SRC[3] = no ports PGID_SRC[4] = no ports PGID_SRC[5] = no ports PGID_SRC[6] = ports 0, 1, 2, 3, 4, 5 Whereas a "good" PGID configuration for that setup should have looked like this: PGID_DST[0] = ports 0 PGID_DST[1] = ports 1, 2 PGID_DST[2] = ports 1, 2 PGID_DST[3] = ports 3 PGID_DST[4] = ports 4 PGID_DST[5] = ports 5 PGID_DST[6] = no ports PGID_AGGR[0] = ports 0, 2, 3, 4, 5 PGID_AGGR[1] = ports 0, 2, 3, 4, 5 PGID_AGGR[2] = ports 0, 2, 3, 4, 5 PGID_AGGR[3] = ports 0, 2, 3, 4, 5 PGID_AGGR[4] = ports 0, 2, 3, 4, 5 PGID_AGGR[5] = ports 0, 2, 3, 4, 5 PGID_AGGR[6] = ports 0, 2, 3, 4, 5 PGID_AGGR[7] = ports 0, 2, 3, 4, 5 PGID_AGGR[8] = ports 0, 2, 3, 4, 5 PGID_AGGR[9] = ports 0, 2, 3, 4, 5 PGID_AGGR[10] = ports 0, 2, 3, 4, 5 PGID_AGGR[11] = ports 0, 2, 3, 4, 5 PGID_AGGR[12] = ports 0, 2, 3, 4, 5 PGID_AGGR[13] = ports 0, 2, 3, 4, 5 PGID_AGGR[14] = ports 0, 2, 3, 4, 5 PGID_AGGR[15] = ports 0, 2, 3, 4, 5 PGID_SRC[0] = ports 1, 2 PGID_SRC[1] = ports 0 PGID_SRC[2] = ports 0 PGID_SRC[3] = no ports PGID_SRC[4] = no ports PGID_SRC[5] = no ports PGID_SRC[6] = ports 0, 1, 2, 3, 4, 5 In other words, in the "bad" configuration, the attempt is to remove the inactive swp1 from the destination ports via PGID_DST. But when a MAC table entry is learned, it is learned towards PGID_DST 1, because that is the logical port id of the LAG itself (it is equal to the lowest numbered member port). So when swp1 becomes inactive, if we set PGID_DST[1] to contain just swp1 and not swp2, the packet will not have any chance to reach the destination via swp2. The "correct" way to remove swp1 as a destination is via PGID_AGGR (remove swp1 from the aggregation port groups for all aggregation codes). This means that PGID_DST[1] and PGID_DST[2] must still contain both swp1 and swp2. This makes the MAC table still treat packets destined towards the single-port LAG as "multicast", and the inactive ports are removed via the aggregation code tables. The change presented here is a design one: the ocelot_get_bond_mask() function used to take an "only_active_ports" argument. We don't need that. The only call site that specifies only_active_ports=true, ocelot_set_aggr_pgids(), must retrieve the entire bonding mask, because it must program that into PGID_DST. Additionally, it must also clear the inactive ports from the bond mask here, which it can't do if bond_mask just contains the active ports: ac = ocelot_read_rix(ocelot, ANA_PGID_PGID, i); ac &= ~bond_mask; <---- here /* Don't do division by zero if there was no active * port. Just make all aggregation codes zero. */ if (num_active_ports) ac \|= BIT(aggr_idx[i % num_active_ports]); ocelot_write_rix(ocelot, ac, ANA_PGID_PGID, i); So it becomes the responsibility of ocelot_set_aggr_pgids() to take ocelot_port->lag_tx_active into consideration when populating the aggr_idx array. Fixes: 23ca3b727ee6 ("net: mscc: ocelot: rebalance LAGs on link up/down events") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220107164332.402133-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	Merge branch '40GbE' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 40GbE Intel Wired LAN Driver Updates 2022-01-07 This series contains updates to i40e and iavf drivers. Karen limits per VF MAC filters so that one VF does not consume all filters for i40e. Jedrzej reduces busy wait time for admin queue calls for i40e. Mateusz updates firmware versions to reflect new supported NVM images and renames an error to remove non-inclusive language for i40e. Yang Li fixes a set but not used warning for i40e. Jason Wang removes an unneeded variable for iavf. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: iavf: remove an unneeded variable i40e: remove variables set but not used i40e: Remove non-inclusive language i40e: Update FW API version i40e: Minimize amount of busy-waiting during AQ send i40e: Add ensurance of MacVlan resources for every trusted VF ==================== Link: https://lore.kernel.org/r/20220107175704.438387-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07	PCI: Correct misspelled words	Krzysztof Wilczyński
	Fix a number of misspelled words, and while at it, correct two phrases used to indicate a status of an operation where words used have been cleverly truncated and thus always trigger a spellchecking error while performing a static code analysis over the PCI tree. [bhelgaas: reverse sense of quirk ternary] Link: https://lore.kernel.org/r/20220107225942.121484-1-kw@linux.com Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-01-07	net/tls: Fix skb memory leak when running kTLS traffic	Gal Pressman
	The cited Fixes commit introduced a memory leak when running kTLS traffic (with/without hardware offloads). I'm running nginx on the server side and wrk on the client side and get the following: unreferenced object 0xffff8881935e9b80 (size 224): comm "softirq", pid 0, jiffies 4294903611 (age 43.204s) hex dump (first 32 bytes): 80 9b d0 36 81 88 ff ff 00 00 00 00 00 00 00 00 ...6............ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000efe2a999>] build_skb+0x1f/0x170 [<00000000ef521785>] mlx5e_skb_from_cqe_mpwrq_linear+0x2bc/0x610 [mlx5_core] [<00000000945d0ffe>] mlx5e_handle_rx_cqe_mpwrq+0x264/0x9e0 [mlx5_core] [<00000000cb675b06>] mlx5e_poll_rx_cq+0x3ad/0x17a0 [mlx5_core] [<0000000018aac6a9>] mlx5e_napi_poll+0x28c/0x1b60 [mlx5_core] [<000000001f3369d1>] __napi_poll+0x9f/0x560 [<00000000cfa11f72>] net_rx_action+0x357/0xa60 [<000000008653b8d7>] __do_softirq+0x282/0x94e [<00000000644923c6>] __irq_exit_rcu+0x11f/0x170 [<00000000d4085f8f>] irq_exit_rcu+0xa/0x20 [<00000000d412fef4>] common_interrupt+0x7d/0xa0 [<00000000bfb0cebc>] asm_common_interrupt+0x1e/0x40 [<00000000d80d0890>] default_idle+0x53/0x70 [<00000000f2b9780e>] default_idle_call+0x8c/0xd0 [<00000000c7659e15>] do_idle+0x394/0x450 I'm not familiar with these areas of the code, but I've added this sk_defer_free_flush() to tls_sw_recvmsg() based on a hunch and it resolved the issue. Fixes: f35f821935d8 ("tcp: defer skb freeing after socket lock is released") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220102081253.9123-1-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>