linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2022-09-30	Fix PM disable depth imbalance in probe	Mark Brown
	Merge series from Zhang Qilong <zhangqilong3@huawei.com>: The pm_runtime_enable will increase power disable depth. Thus a pairing decrement is needed on the error handling path to keep it balanced according to context. We fix it by moving pm_runtime_enable to the endding of probe. Zhang Qilong (4): ASoC: wm8997: Fix PM disable depth imbalance in wm8997_probe ASoC: wm5110: Fix PM disable depth imbalance in wm5110_probe ASoC: wm5102: Fix PM disable depth imbalance in wm5102_probe ASoC: mt6660: Fix PM disable depth imbalance in mt6660_i2c_probe sound/soc/codecs/mt6660.c \| 8 ++++++-- sound/soc/codecs/wm5102.c \| 6 +++--- sound/soc/codecs/wm5110.c \| 6 +++--- sound/soc/codecs/wm8997.c \| 6 +++--- 4 files changed, 15 insertions(+), 11 deletions(-) -- 2.25.1
2022-09-30	Merge branch 'xfrm: add netlink extack to all the ->init_stat'	Steffen Klassert
	Sabrina Dubroca says: ============ This series completes extack support for state creation. ============ Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-09-29	hardening: Remove Clang's enable flag for -ftrivial-auto-var-init=zero	Kees Cook
	Now that Clang's -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang option is no longer required, remove it from the command line. Clang 16 and later will warn when it is used, which will cause Kconfig to think it can't use -ftrivial-auto-var-init=zero at all. Check for whether it is required and only use it when so. Cc: Nathan Chancellor <nathan@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: linux-kbuild@vger.kernel.org Cc: llvm@lists.linux.dev Cc: stable@vger.kernel.org Fixes: f02003c860d9 ("hardening: Avoid harmless Clang option under CONFIG_INIT_STACK_ALL_ZERO") Signed-off-by: Kees Cook <keescook@chromium.org>
2022-09-30	crypto: aspeed - Remove redundant dev_err call	Shang XiaoJing
	devm_ioremap_resource() prints error message in itself. Remove the dev_err call to avoid redundant error message. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-30	crypto: scatterwalk - Remove unused inline function scatterwalk_aligned()	Gaosheng Cui
	The scatterwalk_aligned() are no longer used since removing blkcipher and ablkcipher support, all use of it has been removed since commit d63007eb954e ("crypto: ablkcipher - remove deprecated and unused ablkcipher support"), so remove it. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-30	crypto: aead - Remove unused inline functions from aead	Gaosheng Cui
	The aead_enqueue_request, aead_dequeue_request and aead_get_backlog are no longer used since commit 04a4616e6a21 ("crypto: omap-aes-gcm - convert to use crypto engine"), their functinoality has been replaced by crypto engine, so remove them. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-30	crypto: bcm - Simplify obtain the name for cipher	Gaosheng Cui
	The crypto_ahash_alg_name(tfm) can obtain the name for cipher in include/crypto/hash.h, but now the function is not in use, so we use it to simplify the code, and optimize the code structure. Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-30	crypto: marvell/octeontx - use sysfs_emit() to instead of scnprintf()	ye xingchen
	Replace the open-code with sysfs_emit() to simplify the code. Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-30	hwrng: core - start hwrng kthread also for untrusted sources	Dominik Brodowski
	Start the hwrng kthread even if the hwrng source has a quality setting of zero. Then, every crng reseed interval, one batch of data from this zero-quality hwrng source will be mixed into the CRNG pool. This patch is based on the assumption that data from a hwrng source will not actively harm the CRNG state. Instead, many hwrng sources (such as TPM devices), even though they are assigend a quality level of zero, actually provide some entropy, which is good enough to mix into the CRNG pool every once in a while. Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-30	crypto: zip - remove the unneeded result variable	ye xingchen
	Return the value directly instead of storing it in another redundant variable. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-30	crypto: qat - add limit to linked list parsing	Adam Guerin
	adf_copy_key_value_data() copies data from userland to kernel, based on a linked link provided by userland. If userland provides a circular list (or just a very long one) then it would drive a long loop where allocation occurs in every loop. This could lead to low memory conditions. Adding a limit to stop endless loop. Signed-off-by: Adam Guerin <adam.guerin@intel.com> Co-developed-by: Ciunas Bennett <ciunas.bennett@intel.com> Signed-off-by: Ciunas Bennett <ciunas.bennett@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-30	crypto: octeontx2 - Remove the unneeded result variable	ye xingchen
	Return the value otx2_cpt_send_mbox_msg() directly instead of storing it in another redundant variable. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-30	crypto: ccp - Remove the unneeded result variable	ye xingchen
	Return the value ccp_crypto_enqueue_request() directly instead of storing it in another redundant variable. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn> Acked-by: John Allen <john.allen@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-30	crypto: aspeed - Fix check for platform_get_irq() errors	YueHaibing
	The platform_get_irq() function returns negative on error and positive non-zero values on success. It never returns zero, but if it did then treat that as a success. Also remove redundant dev_err() print as platform_get_irq() already prints an error. Fixes: 108713a713c7 ("crypto: aspeed - Add HACE hash driver") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Neal Liu <neal_liu@aspeedtech.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-30	crypto: virtio - fix memory-leak	lei he
	Fix memory-leak for virtio-crypto akcipher request, this problem is introduced by 59ca6c93387d3(virtio-crypto: implement RSA algorithm). The leak can be reproduced and tested with the following script inside virtual machine: #!/bin/bash LOOP_TIMES=10000 # required module: pkcs8_key_parser, virtio_crypto modprobe pkcs8_key_parser # if CONFIG_PKCS8_PRIVATE_KEY_PARSER=m modprobe virtio_crypto # if CONFIG_CRYPTO_DEV_VIRTIO=m rm -rf /tmp/data dd if=/dev/random of=/tmp/data count=1 bs=230 # generate private key and self-signed cert openssl req -nodes -x509 -newkey rsa:2048 -keyout key.pem \ -outform der -out cert.der \ -subj "/C=CN/ST=GD/L=SZ/O=vihoo/OU=dev/CN=always.com/emailAddress=yy@always.com" # convert private key from pem to der openssl pkcs8 -in key.pem -topk8 -nocrypt -outform DER -out key.der # add key PRIV_KEY_ID=`cat key.der \| keyctl padd asymmetric test_priv_key @s` echo "priv key id = "$PRIV_KEY_ID PUB_KEY_ID=`cat cert.der \| keyctl padd asymmetric test_pub_key @s` echo "pub key id = "$PUB_KEY_ID # query key keyctl pkey_query $PRIV_KEY_ID 0 keyctl pkey_query $PUB_KEY_ID 0 # here we only run pkey_encrypt becasuse it is the fastest interface function bench_pub() { keyctl pkey_encrypt $PUB_KEY_ID 0 /tmp/data enc=pkcs1 >/tmp/enc.pub } # do bench_pub in loop to obtain the memory leak for (( i = 0; i < ${LOOP_TIMES}; ++i )); do bench_pub done Signed-off-by: lei he <helei.sig11@bytedance.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-30	crypto: cavium - prevent integer overflow loading firmware	Dan Carpenter
	The "code_length" value comes from the firmware file. If your firmware is untrusted realistically there is probably very little you can do to protect yourself. Still we try to limit the damage as much as possible. Also Smatch marks any data read from the filesystem as untrusted and prints warnings if it not capped correctly. The "ntohl(ucode->code_length) * 2" multiplication can have an integer overflow. Fixes: 9e2c7d99941d ("crypto: cavium - Add Support for Octeon-tx CPT Engine") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-30	crypto: marvell/octeontx - prevent integer overflows	Dan Carpenter
	The "code_length" value comes from the firmware file. If your firmware is untrusted realistically there is probably very little you can do to protect yourself. Still we try to limit the damage as much as possible. Also Smatch marks any data read from the filesystem as untrusted and prints warnings if it not capped correctly. The "code_length * 2" can overflow. The round_up(ucode_size, 16) + sizeof() expression can overflow too. Prevent these overflows. Fixes: d9110b0b01ff ("crypto: marvell - add support for OCTEON TX CPT engine") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-30	crypto: aspeed - fix build error when only CRYPTO_DEV_ASPEED is enabled	Neal Liu
	Fix build error within the following configs setting: - CONFIG_CRYPTO_DEV_ASPEED=y - CONFIG_CRYPTO_DEV_ASPEED_HACE_HASH is not set - CONFIG_CRYPTO_DEV_ASPEED_HACE_CRYPTO is not set Error messages: make[4]: *** No rule to make target 'drivers/crypto/aspeed/aspeed_crypto.o' , needed by 'drivers/crypto/aspeed/built-in.a'. make[4]: Target '__build' not remade because of errors. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Neal Liu <neal_liu@aspeedtech.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-29	fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE	Lukas Czerner
	Currently the I_DIRTY_TIME will never get set if the inode already has I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME. That's true, however ext4 will only update the on-disk inode in ->dirty_inode(), not on actual writeback. As a result if the inode already has I_DIRTY_INODE state by the time we get to __mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled into on-disk inode and will not get updated until the next I_DIRTY_INODE update, which might never come if we crash or get a power failure. The problem can be reproduced on ext4 by running xfstest generic/622 with -o iversion mount option. Fix it by allowing I_DIRTY_TIME to be set even if the inode already has I_DIRTY_INODE. Also make sure that the case is properly handled in writeback_single_inode() as well. Additionally changes in xfs_fs_dirty_inode() was made to accommodate for I_DIRTY_TIME in flag. Thanks Jan Kara for suggestions on how to make this work properly. Cc: Dave Chinner <david@fromorbit.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: stable@kernel.org Signed-off-by: Lukas Czerner <lczerner@redhat.com> Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220825100657.44217-1-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	ext4: don't increase iversion counter for ea_inodes	Lukas Czerner
	ea_inodes are using i_version for storing part of the reference count so we really need to leave it alone. The problem can be reproduced by xfstest ext4/026 when iversion is enabled. Fix it by not calling inode_inc_iversion() for EXT4_EA_INODE_FL inodes in ext4_mark_iloc_dirty(). Cc: stable@kernel.org Signed-off-by: Lukas Czerner <lczerner@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Link: https://lore.kernel.org/r/20220824160349.39664-1-lczerner@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	ext4: fix check for block being out of directory size	Jan Kara
	The check in __ext4_read_dirblock() for block being outside of directory size was wrong because it compared block number against directory size in bytes. Fix it. Fixes: 65f8ea4cd57d ("ext4: check if directory block is within i_size") CVE: CVE-2022-1184 CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Link: https://lore.kernel.org/r/20220822114832.1482-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	fs/buffer: make submit_bh & submit_bh_wbc return type as void	Ritesh Harjani (IBM)
	submit_bh/submit_bh_wbc are non-blocking functions which just submit the bio and return. The caller of submit_bh/submit_bh_wbc needs to wait on buffer till I/O completion and then check buffer head's b_state field to know if there was any I/O error. Hence there is no need for these functions to have any return type. Even now they always returns 0. Hence drop the return value and make their return type as void to avoid any confusion. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/cb66ef823374cdd94d2d03083ce13de844fffd41.1660788334.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	fs/buffer: drop useless return value of submit_bh	Ritesh Harjani (IBM)
	submit_bh always returns 0. This patch drops the useless return value of submit_bh from __sync_dirty_buffer(). Once all of submit_bh callers are cleaned up, we can make it's return type as void. Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/a98a6ddfac68f73d684c2724952e825bc1f4d238.1660788334.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	fs/ntfs: drop useless return value of submit_bh from ntfs_submit_bh_for_read	Ritesh Harjani (IBM)
	submit_bh always returns 0. This patch drops the useless return value of submit_bh from ntfs_submit_bh_for_read(). Once all of submit_bh callers are cleaned up, we can make it's return type as void. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Ritesh Harjani <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/d82eb29e8dbc52fe13a7affef5c907ea4076aa31.1660788334.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	jbd2: drop useless return value of submit_bh	Ritesh Harjani (IBM)
	submit_bh always returns 0. This patch cleans up 2 of it's caller in jbd2 to drop submit_bh's useless return value. Once all submit_bh callers are cleaned up, we can make it's return type as void. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://lore.kernel.org/r/e069c0539be0aec61abcdc6f6141982ec85d489d.1660788334.git.ritesh.list@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	ext4: make ext4_lazyinit_thread freezable	Lalith Rajendran
	ext4_lazyinit_thread is not set freezable. Hence when the thread calls try_to_freeze it doesn't freeze during suspend and continues to send requests to the storage during suspend, resulting in suspend failures. Cc: stable@kernel.org Signed-off-by: Lalith Rajendran <lalithkraj@google.com> Link: https://lore.kernel.org/r/20220818214049.1519544-1-lalithkraj@google.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2022-09-29	Merge branch '100GbE' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-09-28 (ice) Arkadiusz implements a single pin initialization function, checking feature bits, instead of having separate device functions and updates sub-device IDs for recognizing E810T devices. Martyna adds support for switchdev filters on VLAN priority field. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: Add support for VLAN priority filters in switchdev ice: support features on new E810T variants ice: Merge pin initialization of E810 and E810T adapters ==================== Link: https://lore.kernel.org/r/20220928203217.411078-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	eth: alx: take rtnl_lock on resume	Jakub Kicinski
	Zbynek reports that alx trips an rtnl assertion on resume: RTNL: assertion failed at net/core/dev.c (2891) RIP: 0010:netif_set_real_num_tx_queues+0x1ac/0x1c0 Call Trace: <TASK> __alx_open+0x230/0x570 [alx] alx_resume+0x54/0x80 [alx] ? pci_legacy_resume+0x80/0x80 dpm_run_callback+0x4a/0x150 device_resume+0x8b/0x190 async_resume+0x19/0x30 async_run_entry_fn+0x30/0x130 process_one_work+0x1e5/0x3b0 indeed the driver does not hold rtnl_lock during its internal close and re-open functions during suspend/resume. Note that this is not a huge bug as the driver implements its own locking, and does not implement changing the number of queues, but we need to silence the splat. Fixes: 4a5fe57e7751 ("alx: use fine-grained locking instead of RTNL") Reported-and-tested-by: Zbynek Michl <zbynek.michl@gmail.com> Reviewed-by: Niels Dossche <dossche.niels@gmail.com> Link: https://lore.kernel.org/r/20220928181236.1053043-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	sparc: Unbreak the build	Bart Van Assche
	Fix the following build errors: arch/sparc/mm/srmmu.c: In function ‘smp_flush_page_for_dma’: arch/sparc/mm/srmmu.c:1639:13: error: cast between incompatible function types from ‘void ()(long unsigned int)’ to ‘void ()(long unsigned int, long unsigned int, long unsigned int, long unsigned int, long unsigned int)’ [-Werror=cast-function-type] 1639 \| xc1((smpfunc_t) local_ops->page_for_dma, page); \| ^ arch/sparc/mm/srmmu.c: In function ‘smp_flush_cache_mm’: arch/sparc/mm/srmmu.c:1662:29: error: cast between incompatible function types from ‘void ()(struct mm_struct )’ to ‘void (*)(long unsigned int, long unsigned int, long unsigned int, long unsigned int, long unsigned int)’ [-Werror=cast-function-type] 1662 \| xc1((smpfunc_t) local_ops->cache_mm, (unsigned long) mm); \| [ ... ] Compile-tested only. Fixes: 552a23a0e5d0 ("Makefile: Enable -Wcast-function-type") Cc: stable@vger.kernel.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220830205854.1918026-1-bvanassche@acm.org
2022-09-29	net: phy: Convert to use sysfs_emit() APIs	Wang Yufen
	Follow the advice of the Documentation/filesystems/sysfs.rst and show() should only use sysfs_emit() or sysfs_emit_at() when formatting the value to be returned to user space. Signed-off-by: Wang Yufen <wangyufen@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/1664364860-29153-1-git-send-email-wangyufen@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	Merge branch 'add-tc-taprio-support-for-queuemaxsdu'	Jakub Kicinski
	Vladimir Oltean says: ==================== Add tc-taprio support for queueMaxSDU The tc-taprio offload mode supported by the Felix DSA driver has limitations surrounding its guard bands. The initial discussion was at: https://lore.kernel.org/netdev/c7618025da6723418c56a54fe4683bd7@walle.cc/ with the latest status being that we now have a vsc9959_tas_guard_bands_update() method which makes a best-guess attempt at how much useful space to reserve for packet scheduling in a taprio interval, and how much to reserve for guard bands. IEEE 802.1Q actually does offer a tunable variable (queueMaxSDU) which can determine the max MTU supported per traffic class. In turn we can determine the size we need for the guard bands, depending on the queueMaxSDU. This way we can make the guard band of small taprio intervals smaller than one full MTU worth of transmission time, if we know that said traffic class will transport only smaller packets. As discussed with Gerhard Engleder, the queueMaxSDU may also be useful in limiting the latency on an endpoint, if some of the TX queues are outside of the control of the Linux driver. https://patchwork.kernel.org/project/netdevbpf/patch/20220914153303.1792444-11-vladimir.oltean@nxp.com/ Allow input of queueMaxSDU through netlink into tc-taprio, offload it to the hardware I have access to (LS1028A), and (implicitly) deny non-default values to everyone else. Kurt Kanzenbach has also kindly tested and shared a patch to offload this to hellcreek. v3 at: https://patchwork.kernel.org/project/netdevbpf/cover/20220927234746.1823648-1-vladimir.oltean@nxp.com/ v2 at: https://patchwork.kernel.org/project/netdevbpf/list/?series=679954&state=* v1 at: https://patchwork.kernel.org/project/netdevbpf/cover/20220914153303.1792444-1-vladimir.oltean@nxp.com/ ==================== Link: https://lore.kernel.org/r/20220928095204.2093716-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: enetc: offload per-tc max SDU from tc-taprio	Vladimir Oltean
	The driver currently sets the PTCMSDUR register statically to the max MTU supported by the interface. Keep this logic if tc-taprio is absent or if the max_sdu for a traffic class is 0, and follow the requested max SDU size otherwise. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: enetc: use common naming scheme for PTGCR and PTGCAPR registers	Vladimir Oltean
	The Port Time Gating Control Register (PTGCR) and Port Time Gating Capability Register (PTGCAPR) have definitions in the driver which aren't in line with the other registers. Rename these. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: enetc: cache accesses to &priv->si->hw	Vladimir Oltean
	The &priv->si->hw construct dereferences 2 pointers and makes lines longer than they need to be, in turn making the code harder to read. Replace &priv->si->hw accesses with a "hw" variable when there are 2 or more accesses within a function that dereference this. This includes loops, since &priv->si->hw is a loop invariant. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: dsa: hellcreek: Offload per-tc max SDU from tc-taprio	Kurt Kanzenbach
	Add support for configuring the max SDU per priority and per port. If not specified, keep the default. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: dsa: hellcreek: refactor hellcreek_port_setup_tc() to use switch/case	Vladimir Oltean
	The following patch will need to make this function also respond to TC_QUERY_BASE, so make the processing more structured around the tc_setup_type. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: dsa: felix: offload per-tc max SDU from tc-taprio	Vladimir Oltean
	Our current vsc9959_tas_guard_bands_update() algorithm has a limitation imposed by the hardware design. To avoid packet overruns between one gate interval and the next (which would add jitter for scheduled traffic in the next gate), we configure the switch to use guard bands. These are as large as the largest packet which is possible to be transmitted. The problem is that at tc-taprio intervals of sizes comparable to a guard band, there isn't an obvious place in which to split the interval between the useful portion (for scheduling) and the guard band portion (where scheduling is blocked). For example, a 10 us interval at 1Gbps allows 1225 octets to be transmitted. We currently split the interval between the bare minimum of 33 ns useful time (required to schedule a single packet) and the rest as guard band. But 33 ns of useful scheduling time will only allow a single packet to be sent, be that packet 1200 octets in size, or 60 octets in size. It is impossible to send 2 60 octets frames in the 10 us window. Except that if we reduced the guard band (and therefore the maximum allowable SDU size) to 5 us, the useful time for scheduling is now also 5 us, so more packets could be scheduled. The hardware inflexibility of not scheduling according to individual packet lengths must unfortunately propagate to the user, who needs to tune the queueMaxSDU values if he wants to fit more small packets into a 10 us interval, rather than one large packet. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net/sched: taprio: allow user input of per-tc max SDU	Vladimir Oltean
	IEEE 802.1Q clause 12.29.1.1 "The queueMaxSDUTable structure and data types" and 8.6.8.4 "Enhancements for scheduled traffic" talk about the existence of a per traffic class limitation of maximum frame sizes, with a fallback on the port-based MTU. As far as I am able to understand, the 802.1Q Service Data Unit (SDU) represents the MAC Service Data Unit (MSDU, i.e. L2 payload), excluding any number of prepended VLAN headers which may be otherwise present in the MSDU. Therefore, the queueMaxSDU is directly comparable to the device MTU (1500 means L2 payload sizes are accepted, or frame sizes of 1518 octets, or 1522 plus one VLAN header). Drivers which offload this are directly responsible of translating into other units of measurement. To keep the fast path checks optimized, we keep 2 arrays in the qdisc, one for max_sdu translated into frame length (so that it's comparable to skb->len), and another for offloading and for dumping back to the user. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net/sched: query offload capabilities through ndo_setup_tc()	Vladimir Oltean
	When adding optional new features to Qdisc offloads, existing drivers must reject the new configuration until they are coded up to act on it. Since modifying all drivers in lockstep with the changes in the Qdisc can create problems of its own, it would be nice if there existed an automatic opt-in mechanism for offloading optional features. Jakub proposes that we multiplex one more kind of call through ndo_setup_tc(): one where the driver populates a Qdisc-specific capability structure. First user will be taprio in further changes. Here we are introducing the definitions for the base functionality. Link: https://patchwork.kernel.org/project/netdevbpf/patch/20220923163310.3192733-3-vladimir.oltean@nxp.com/ Suggested-by: Jakub Kicinski <kuba@kernel.org> Co-developed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net/tipc: Remove unused struct distr_queue_item	Yuan Can
	After commit 09b5678c778f("tipc: remove dead code in tipc_net and relatives"), struct distr_queue_item is not used any more and can be removed as well. Signed-off-by: Yuan Can <yuancan@huawei.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Link: https://lore.kernel.org/r/20220928085636.71749-1-yuancan@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: skb: introduce and use a single page frag cache	Paolo Abeni
	After commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") we are observing 10-20% regressions in performance tests with small packets. The perf trace points to high pressure on the slab allocator. This change tries to improve the allocation schema for small packets using an idea originally suggested by Eric: a new per CPU page frag is introduced and used in __napi_alloc_skb to cope with small allocation requests. To ensure that the above does not lead to excessive truesize underestimation, the frag size for small allocation is inflated to 1K and all the above is restricted to build with 4K page size. Note that we need to update accordingly the run-time check introduced with commit fd9ea57f4e95 ("net: add napi_get_frags_check() helper"). Alex suggested a smart page refcount schema to reduce the number of atomic operations and deal properly with pfmemalloc pages. Under small packet UDP flood, I measure a 15% peak tput increases. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Suggested-by: Alexander H Duyck <alexanderduyck@fb.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/6b6f65957c59f86a353fc09a5127e83a32ab5999.1664350652.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	net: sched: cls_u32: Avoid memcpy() false-positive warning	Kees Cook
	To work around a misbehavior of the compiler's ability to see into composite flexible array structs (as detailed in the coming memcpy() hardening series[1]), use unsafe_memcpy(), as the sizing, bounds-checking, and allocation are all very tightly coupled here. This silences the false-positive reported by syzbot: memcpy: detected field-spanning write (size 80) of single field "&n->sel" at net/sched/cls_u32.c:1043 (size 16) [1] https://lore.kernel.org/linux-hardening/20220901065914.1417829-2-keescook@chromium.org Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Reported-by: syzbot+a2c4601efc75848ba321@syzkaller.appspotmail.com Link: https://lore.kernel.org/lkml/000000000000a96c0b05e97f0444@google.com/ Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20220927153700.3071688-1-keescook@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	dt-bindings: net: snps,dwmac: Document stmmac-axi-config subnode	Marek Vasut
	The stmmac-axi-config subnode is present in multiple dwmac instance DTs, document its content per snps,axi-config property description which is a phandle to this subnode. Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220927012449.698915-1-marex@denx.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	docs: netlink: clarify the historical baggage of Netlink flags	Jakub Kicinski
	nlmsg_flags are full of historical baggage, inconsistencies and strangeness. Try to document it more thoroughly. Explain the meaning of the ECHO flag (and while at it clarify the comment in the uAPI). Handwave a little about the NEW request flags and how they make sense on the surface but cater to really old paradigm before commands were a thing. I will add more notes on how to make use of ECHO and discouragement for reuse of flags to the kernel-side documentation. Link: https://lore.kernel.org/r/20220927212306.823862-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	vhost/vsock: Use kvmalloc/kvfree for larger packets.	Junichi Uekawa
	When copying a large file over sftp over vsock, data size is usually 32kB, and kmalloc seems to fail to try to allocate 32 32kB regions. vhost-5837: page allocation failure: order:4, mode:0x24040c0 Call Trace: [<ffffffffb6a0df64>] dump_stack+0x97/0xdb [<ffffffffb68d6aed>] warn_alloc_failed+0x10f/0x138 [<ffffffffb68d868a>] ? __alloc_pages_direct_compact+0x38/0xc8 [<ffffffffb664619f>] __alloc_pages_nodemask+0x84c/0x90d [<ffffffffb6646e56>] alloc_kmem_pages+0x17/0x19 [<ffffffffb6653a26>] kmalloc_order_trace+0x2b/0xdb [<ffffffffb66682f3>] __kmalloc+0x177/0x1f7 [<ffffffffb66e0d94>] ? copy_from_iter+0x8d/0x31d [<ffffffffc0689ab7>] vhost_vsock_handle_tx_kick+0x1fa/0x301 [vhost_vsock] [<ffffffffc06828d9>] vhost_worker+0xf7/0x157 [vhost] [<ffffffffb683ddce>] kthread+0xfd/0x105 [<ffffffffc06827e2>] ? vhost_dev_set_owner+0x22e/0x22e [vhost] [<ffffffffb683dcd1>] ? flush_kthread_worker+0xf3/0xf3 [<ffffffffb6eb332e>] ret_from_fork+0x4e/0x80 [<ffffffffb683dcd1>] ? flush_kthread_worker+0xf3/0xf3 Work around by doing kvmalloc instead. Fixes: 433fc58e6bf2 ("VSOCK: Introduce vhost_vsock.ko") Signed-off-by: Junichi Uekawa <uekawa@chromium.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20220928064538.667678-1-uekawa@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29	clk: fixed-rate: add devm_clk_hw_register_fixed_rate	Dmitry Baryshkov
	Add devm_clk_hw_register_fixed_rate(), devres-managed helper to register fixed-rate clock. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220916061740.87167-3-dmitry.baryshkov@linaro.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-09-29	clk: asm9260: use parent index to link the reference clock	Dmitry Baryshkov
	Rewrite clk-asm9260 to use parent index to use the reference clock. During this rework two helpers are added: - clk_hw_register_mux_table_parent_data() to supplement clk_hw_register_mux_table() but using parent_data instead of parent_names - clk_hw_register_fixed_rate_parent_accuracy() to be used instead of directly calling __clk_hw_register_fixed_rate(). The later function is an internal API, which is better not to be called directly. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220916061740.87167-2-dmitry.baryshkov@linaro.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-09-29	Merge tag 'mtk-clk-for-6.1' of ↵	Stephen Boyd
	https://git.kernel.org/pub/scm/linux/kernel/git/wens/linux into clk-mtk Pull MediaTek clk driver updates from Chen-Yu Tsai: A lot of clean up work, as well as new drivers and new functions - New clock drivers for MediaTek Helio X10 MT6795 - Add missing DPI1_HDMI clock in MT8195 VDOSYS1 - Clock driver changes to support GPU DVFS on MT8183, MT8192, MT8195 - Fix GPU clock topology on MT8195 - Propogate rate changes from GPU clock gate up the tree - Clock mux notifiers for GPU-related PLLs - Conversion of more "simple" drivers to mtk_clk_simple_probe() - Hook up mtk_clk_simple_remove() for "simple" MT8192 clock drivers - Fixes to previous \|struct clk\| to \|struct clk_hw\| conversion - Shrink MT8192 clock driver by deduplicating clock parent lists * tag 'mtk-clk-for-6.1' of https://git.kernel.org/pub/scm/linux/kernel/git/wens/linux: (31 commits) clk: mediatek: mt8192: deduplicate parent clock lists clk: mediatek: Migrate remaining clk_unregister_() to clk_hw_unregister_() clk: mediatek: fix unregister function in mtk_clk_register_dividers cleanup clk: mediatek: clk-mt8192: Add clock mux notifier for mfg_pll_sel clk: mediatek: clk-mt8192-mfg: Propagate rate changes to parent clk: mediatek: clk-mt8195-topckgen: Drop univplls from mfg mux parents clk: mediatek: clk-mt8195-topckgen: Add GPU clock mux notifier clk: mediatek: clk-mt8195-topckgen: Register mfg_ck_fast_ref as generic mux clk: mediatek: clk-mt8195-mfg: Reparent mfg_bg3d and propagate rate changes clk: mediatek: mt8183: Add clk mux notifier for MFG mux clk: mediatek: mux: add clk notifier functions clk: mediatek: mt8183: mfgcfg: Propagate rate changes to parent clk: mediatek: Use mtk_clk_register_gates_with_dev in simple probe clk: mediatek: gate: Export mtk_clk_register_gates_with_dev clk: mediatek: add VDOSYS1 clock dt-bindings: clk: mediatek: Add MT8195 DPI clocks clk: mediatek: mt8192: add mtk_clk_simple_remove clk: mediatek: mt8183: use mtk_clk_simple_probe to simplify driver clk: mediatek: mt6797: use mtk_clk_simple_probe to simplify driver clk: mediatek: mt6779: use mtk_clk_simple_probe to simplify driver ...
2022-09-30	drm/msm: Fix build break with recent mm tree	Rob Clark
	9178e3dcb121 ("mm: discard __GFP_ATOMIC") removed __GFP_ATOMIC, replacing it with a check for not __GFP_DIRECT_RECLAIM. Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Rob Clark <robdclark@chromium.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220929161404.2769414-1-robdclark@gmail.com
2022-09-29	sbitmap: fix lockup while swapping	Hugh Dickins
	Commit 4acb83417cad ("sbitmap: fix batched wait_cnt accounting") is a big improvement: without it, I had to revert to before commit 040b83fcecfb ("sbitmap: fix possible io hung due to lost wakeup") to avoid the high system time and freezes which that had introduced. Now okay on the NVME laptop, but 4acb83417cad is a disaster for heavy swapping (kernel builds in low memory) on another: soon locking up in sbitmap_queue_wake_up() (into which __sbq_wake_up() is inlined), cycling around with waitqueue_active() but wait_cnt 0 . Here is a backtrace, showing the common pattern of outer sbitmap_queue_wake_up() interrupted before setting wait_cnt 0 back to wake_batch (in some cases other CPUs are idle, in other cases they're spinning for a lock in dd_bio_merge()): sbitmap_queue_wake_up < sbitmap_queue_clear < blk_mq_put_tag < __blk_mq_free_request < blk_mq_free_request < __blk_mq_end_request < scsi_end_request < scsi_io_completion < scsi_finish_command < scsi_complete < blk_complete_reqs < blk_done_softirq < __do_softirq < __irq_exit_rcu < irq_exit_rcu < common_interrupt < asm_common_interrupt < _raw_spin_unlock_irqrestore < __wake_up_common_lock < __wake_up < sbitmap_queue_wake_up < sbitmap_queue_clear < blk_mq_put_tag < __blk_mq_free_request < blk_mq_free_request < dd_bio_merge < blk_mq_sched_bio_merge < blk_mq_attempt_bio_merge < blk_mq_submit_bio < __submit_bio < submit_bio_noacct_nocheck < submit_bio_noacct < submit_bio < __swap_writepage < swap_writepage < pageout < shrink_folio_list < evict_folios < lru_gen_shrink_lruvec < shrink_lruvec < shrink_node < do_try_to_free_pages < try_to_free_pages < __alloc_pages_slowpath < __alloc_pages < folio_alloc < vma_alloc_folio < do_anonymous_page < __handle_mm_fault < handle_mm_fault < do_user_addr_fault < exc_page_fault < asm_exc_page_fault See how the process-context sbitmap_queue_wake_up() has been interrupted, after bringing wait_cnt down to 0 (and in this example, after doing its wakeups), before advancing wake_index and refilling wake_cnt: an interrupt-context sbitmap_queue_wake_up() of the same sbq gets stuck. I have almost no grasp of all the possible sbitmap races, and their consequences: but __sbq_wake_up() can do nothing useful while wait_cnt 0, so it is better if sbq_wake_ptr() skips on to the next ws in that case: which fixes the lockup and shows no adverse consequence for me. The check for wait_cnt being 0 is obviously racy, and ultimately can lead to lost wakeups: for example, when there is only a single waitqueue with waiters. However, lost wakeups are unlikely to matter in these cases, and a proper fix requires redesign (and benchmarking) of the batched wakeup code: so let's plug the hole with this bandaid for now. Signed-off-by: Hugh Dickins <hughd@google.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/9c2038a7-cdc5-5ee-854c-fbc6168bf16@google.com Signed-off-by: Jens Axboe <axboe@kernel.dk>