linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2022-09-29	dmaengine: idxd: track enabled workqueues in bitmap	Jerry Snitselaar
	Now that idxd_wq_disable_cleanup() sets the workqueue state to IDXD_WQ_DISABLED, use a bitmap to track which workqueues have been enabled. This will then be used to determine which workqueues should be re-enabled when attempting a software reset to recover from a device halt state. Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vinod Koul <vkoul@kernel.org> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20220928154856.623545-3-jsnitsel@redhat.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	dmaengine: idxd: Set wq state to disabled in idxd_wq_disable_cleanup()	Jerry Snitselaar
	If we are calling idxd_wq_disable_cleanup(), the workqueue should be in a disabled state. So set the workqueue state to IDXD_WQ_DISABLED so that the state reflects that. Currently if there is a device failure, and a software reset is attempted the workqueues will not be re-enabled due to idxd_wq_enable() seeing that state as already being IDXD_WQ_ENABLED. Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vinod Koul <vkoul@kernel.org> Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20220928154856.623545-2-jsnitsel@redhat.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	ethernet: s2io: Use skb_put_data() instead of skb_put/memcpy pair	Shang XiaoJing
	Use skb_put_data() instead of skb_put() and memcpy(), which is shorter and clear. Drop the tmp variable that is not needed any more. Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Link: https://lore.kernel.org/r/20220927022802.16050-1-shangxiaojing@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-29	dmaengine: pl08x: Fix double word	Shaomin Deng
	Fix the double word "many" in comments. Signed-off-by: Shaomin Deng <dengshaomin@cdjrlc.com> Link: https://lore.kernel.org/r/20220830150708.24507-1-dengshaomin@cdjrlc.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	dmaengine: virt-dma: Fix double word in comments	Shaomin Deng
	Delete the double word "many" in comments. Signed-off-by: Shaomin Deng <dengshaomin@cdjrlc.com> Link: https://lore.kernel.org/r/20220825144545.3528-1-dengshaomin@cdjrlc.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	dmaengine: qcom: gpi: move read_lock_bh to read_lock in tasklet	Tuo Cao
	it is unnecessary to call read_lock_bh in a tasklet. Signed-off-by: Tuo Cao <91tuocao@gmail.com> Link: https://lore.kernel.org/r/20220814131323.7029-1-91tuocao@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	dmaengine: mxs: use platform_driver_register	Dario Binacchi
	Driver registration fails on SOC imx8mn as its supplier, the clock control module, is probed later than subsys initcall level. This driver uses platform_driver_probe which is not compatible with deferred probing and won't be probed again later if probe function fails due to clock not being available at that time. This patch replaces the use of platform_driver_probe with platform_driver_register which will allow probing the driver later again when the clock control module will be available. The __init annotation has been dropped because it is not compatible with deferred probing. The code is not executed once and its memory cannot be freed. Fixes: a580b8c5429a ("dmaengine: mxs-dma: add dma support for i.MX23/28") Co-developed-by: Michael Trimarchi <michael@amarulasolutions.com> Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com> Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Acked-by: Sascha Hauer <s.hauer@pengutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220921170556.1055962-1-dario.binacchi@amarulasolutions.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	dt-bindings: phy: qcom,qusb2: document sdm670 compatible	Richard Acayan
	The Snapdragon 670 uses the QUSB driver for USB 2.0. Document the compatible used in the device tree. Signed-off-by: Richard Acayan <mailingradian@gmail.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20220922024656.178529-2-mailingradian@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	phy: qcom-qmp-pcie: fix resource mapping for SDM845 QHP PHY	Dmitry Baryshkov
	On SDM845 one of PCIe PHYs (the QHP one) has the same region for TX and RX registers. Since the commit 4be26f695ffa ("phy: qcom-qmp-pcie: fix memleak on probe deferral") added checking that resources are not allocated beforehand, this PHY can not be probed anymore. Fix this by skipping the map of ->rx resource on the QHP PHY and assign it manually. Fixes: 4be26f695ffa ("phy: qcom-qmp-pcie: fix memleak on probe deferral") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20220926172514.880776-1-dmitry.baryshkov@linaro.org Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	wifi: rtl8xxxu: Improve rtl8xxxu_queue_select	Bitterblue Smith
	Remove the unused ieee80211_hw* parameter, and pass ieee80211_hdr* instead of relying on skb->data having the right value at the time the function is called. This doesn't change the functionality at all. Fixes: 26f1fad29ad9 ("New driver: rtl8xxxu (mac80211)") Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com> Acked-by: Jes Sorensen <jes@trained-monkey.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/2af44c28-1c12-46b9-85b9-011560bf7f7e@gmail.com
2022-09-29	wifi: rtl8xxxu: Fix AIFS written to REG_EDCA_*_PARAM	Bitterblue Smith
	ieee80211_tx_queue_params.aifs is not supposed to be written directly to the REG_EDCA_*_PARAM registers. Instead process it like the vendor drivers do. It's kinda hacky but it works. This change boosts the download speed and makes it more stable. Tested with RTL8188FU but all the other supported chips should also benefit. Fixes: 26f1fad29ad9 ("New driver: rtl8xxxu (mac80211)") Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com> Acked-by: Jes Sorensen <jes@trained-monkey.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/038cc03f-3567-77ba-a7bd-c4930e3b2fad@gmail.com
2022-09-29	wifi: rtl8xxxu: gen2: Enable 40 MHz channel width	Bitterblue Smith
	The module parameter ht40_2g was supposed to enable 40 MHz operation, but it didn't. Tell the firmware about the channel width when updating the rate mask. This makes it work with my gen 2 chip RTL8188FU. I'm not sure if anything needs to be done for the gen 1 chips, if 40 MHz channel width already works or not. They update the rate mask with a different structure which doesn't have a field for the channel width. Also set the channel width correctly for sta_statistics. Fixes: f653e69009c6 ("rtl8xxxu: Implement basic 8723b specific update_rate_mask() function") Fixes: bd917b3d28c9 ("rtl8xxxu: fill up txrate info for gen1 chips") Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com> Acked-by: Jes Sorensen <jes@trained-monkey.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/3a950997-7580-8a6b-97a0-e0a81a135456@gmail.com
2022-09-29	phy: rockchip-snps-pcie3: only look for rockchip,pipe-grf on rk3588	Aurelien Jarno
	The rockchip,pipe-grf property is only used on rk3588, but not on rk3568. Therefore this property is not present on rk3568 devices, leading to the following message: rockchip-snps-pcie3-phy fe8c0000.phy: failed to find rockchip,pipe_grf regmap Fix that by only looking for this property on rk3588. Fixes: 2e9bffc4f713d ("phy: rockchip: Support PCIe v3") Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Link: https://lore.kernel.org/r/20220927051752.53089-1-aurelien@aurel32.net Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	ALSA: asihpi - Remove unused struct hpi_subsys_response	Yuan Can
	After commit 3285ea10e9b0("ALSA: asihpi - Interrelated HPI tidy up."), struct hpi_subsys_response is not used any more and can be removed as well. Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20220928084833.61131-1-yuancan@huawei.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-09-29	ALSA: sb: Use DIV_ROUND_UP() instead of open-coding it	Shang XiaoJing
	Use DIV_ROUND_UP() instead of open-coding it, which intents and makes it more clear what is going on for the casual reviewer. The Coccinelle references Commit e4d8aef21403 ("ALSA: usb: Use DIV_ROUND_UP() instead of open-coding it"). Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com> Link: https://lore.kernel.org/r/20220927141110.18033-1-shangxiaojing@huawei.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-09-29	phy: tegra: xusb: Enable usb role switch attribute	Wayne Chang
	This patch enables the usb-role-switch attribute and lets users check the current device role of the otg capability ports Signed-off-by: Wayne Chang <waynec@nvidia.com> Signed-off-by: Haotien Hsu <haotienh@nvidia.com> Link: https://lore.kernel.org/r/20220928125640.2219402-1-haotienh@nvidia.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	phy: mediatek: fix build warning of FIELD_PREP()	Chunfeng Yun
	Change the inline function mtk_phy_update_field() into a macro to avoid check warning of FIELD_PREP() with compiler parameter -Wtautological-constant-out-of-range-compare the warning is caused by mask check: "BUILD_BUG_ON_MSG(__bf_cast_unsigned(_mask, _mask) > \" Fixes: 29c07477556e ("phy: mediatek: add a new helper to update bitfield") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20220928070746.5393-1-chunfeng.yun@mediatek.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2022-09-29	xfrm: mip6: add extack to mip6_destopt_init_state, mip6_rthdr_init_state	Sabrina Dubroca
	Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-09-29	xfrm: ipcomp: add extack to ipcomp{4,6}_init_state	Sabrina Dubroca
	And the shared helper ipcomp_init_state. Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-09-29	xfrm: tunnel: add extack to ipip_init_state, xfrm6_tunnel_init_state	Sabrina Dubroca
	Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-09-29	xfrm: esp: add extack to esp_init_state, esp6_init_state	Sabrina Dubroca
	Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-09-29	xfrm: ah: add extack to ah_init_state, ah6_init_state	Sabrina Dubroca
	Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-09-29	xfrm: pass extack down to xfrm_type ->init_state	Sabrina Dubroca
	Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-09-29	clk: mediatek: mt8192: deduplicate parent clock lists	Chen-Yu Tsai
	Some groups of clocks of the same type share the same list of parents. These lists were declared separately for each clock in older drivers, bloating the code. Merge some obvious duplicate parent clock lists in the MT8192 clock driver together to reduce the code size. These include: - apll_i2s_m_parents into one as apll_i2s_m_parents - img1_parents & img2_parents into one as img_parents - msdc30__parents into one as msdc30_parents - camtg_parents into cam_tg_parents - seninf_parents into seninf_parents Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20220926102523.2367530-6-wenst@chromium.org Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
2022-09-29	clk: mediatek: Migrate remaining clk_unregister_() to clk_hw_unregister_()	Chen-Yu Tsai
	During the previous \|struct clk\| to \|struct clk_hw\| clk provider API migration in commit 6f691a586296 ("clk: mediatek: Switch to clk_hw provider APIs"), a few clk_unregister_() calls were missed. Migrate the remaining ones to the \|struct clk_hw\| provider API, i.e. change clk_unregister_() to clk_hw_unregister_*(). Fixes: 6f691a586296 ("clk: mediatek: Switch to clk_hw provider APIs") Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20220926102523.2367530-3-wenst@chromium.org Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
2022-09-29	clk: mediatek: fix unregister function in mtk_clk_register_dividers cleanup	Chen-Yu Tsai
	When the cleanup paths for the various clk register APIs in the MediaTek clk library were added, the one in the dividers type used the wrong type of unregister function. This would result in incorrect dereferencing of the clk pointer and freeing of invalid pointers. Fix this by switching to the correct type of clk unregistration call. Fixes: 3c3ba2ab0226 ("clk: mediatek: mtk: Implement error handling in register APIs") Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20220926102523.2367530-2-wenst@chromium.org Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
2022-09-29	clk: mediatek: clk-mt8192: Add clock mux notifier for mfg_pll_sel	AngeloGioacchino Del Regno
	Following the changes that were done for mt8183, add a clock notifier for the GPU PLL selector mux: this allows safe clock rate changes by temporarily reparenting the GPU to a safe clock (clk26m) while the MFGPLL is reprogrammed and stabilizes. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Link: https://lore.kernel.org/r/20220927101128.44758-11-angelogioacchino.delregno@collabora.com Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
2022-09-29	clk: mediatek: clk-mt8192-mfg: Propagate rate changes to parent	AngeloGioacchino Del Regno
	Following what was done on MT8183 and MT8195, also propagate the rate changes to MFG_BG3D's parent on MT8192 to allow for proper GPU DVFS. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Link: https://lore.kernel.org/r/20220927101128.44758-10-angelogioacchino.delregno@collabora.com Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
2022-09-29	clk: mediatek: clk-mt8195-topckgen: Drop univplls from mfg mux parents	AngeloGioacchino Del Regno
	These PLLs are conflicting with GPU rates that can be generated by the GPU-dedicated MFGPLL and would require a special clock handler to be used, for very little and ignorable power consumption benefits. Also, we're in any case unable to set the rate of these PLLs to something else that is sensible for this task, so simply drop them: this will make the GPU to be clocked exclusively from MFGPLL for "fast" rates, while still achieving the right "safe" rate during PLL frequency locking. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Link: https://lore.kernel.org/r/20220927101128.44758-9-angelogioacchino.delregno@collabora.com Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
2022-09-29	clk: mediatek: clk-mt8195-topckgen: Add GPU clock mux notifier	AngeloGioacchino Del Regno
	Following the changes done to MT8183, register a similar notifier for MT8195 as well, allowing safe clockrate updates for the MFGPLL. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Miles Chen <miles.chen@mediatek.com> Link: https://lore.kernel.org/r/20220927101128.44758-8-angelogioacchino.delregno@collabora.com Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
2022-09-29	clk: mediatek: clk-mt8195-topckgen: Register mfg_ck_fast_ref as generic mux	AngeloGioacchino Del Regno
	This clock was being registered as clk-composite through the helpers for the same in the MediaTek clock APIs but, in reality, this isn't a composite clock. Appropriately register this clock with devm_clk_hw_register_mux(). No functional changes. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Link: https://lore.kernel.org/r/20220927101128.44758-7-angelogioacchino.delregno@collabora.com Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
2022-09-29	clk: mediatek: clk-mt8195-mfg: Reparent mfg_bg3d and propagate rate changes	AngeloGioacchino Del Regno
	The MFG_BG3D is a gate to enable/disable clock output to the GPU, but the actual output is decided by multiple muxes; in particular: mfg_ck_fast_ref muxes between "slow" (top_mfg_core_tmp) and "fast" (MFGPLL) clock, while top_mfg_core_tmp muxes between the 26MHz clock and various system PLLs. The clock gate comes after all the muxes, so its parent is mfg_ck_fast_reg, not top_mfg_core_tmp. Reparent MFG_BG3D to the latter to match the hardware and add the CLK_SET_RATE_PARENT flag to it: this way we ensure propagating rate changes that are requested on MFG_BG3D along its entire clock tree. Fixes: 35016f10c0e5 ("clk: mediatek: Add MT8195 mfgcfg clock support") Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Chen-Yu Tsai <wenst@chromium.org> Link: https://lore.kernel.org/r/20220927101128.44758-6-angelogioacchino.delregno@collabora.com Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
2022-09-29	clk: mediatek: mt8183: Add clk mux notifier for MFG mux	Chen-Yu Tsai
	When the MFG PLL clock, which is upstream of the MFG clock, is changed, the downstream clock and consumers need to be switched away from the PLL over to a stable clock to avoid glitches. This is done through the use of the newly added clk mux notifier. The notifier is set on the mux itself instead of the upstream PLL, but in practice this works, as the rate change notifitcations are propogated throughout the sub-tree hanging off the PLL. Just before rate changes, the MFG mux is temporarily and transparently switched to the 26 MHz main crystal. After the rate change, the mux is switched back. Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> [Angelo: Rebased to assign clk_ops in mtk_mux_nb] Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Miles Chen <miles.chen@mediatek.com> Link: https://lore.kernel.org/r/20220927101128.44758-5-angelogioacchino.delregno@collabora.com Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
2022-09-29	clk: mediatek: mux: add clk notifier functions	Chen-Yu Tsai
	With device frequency scaling, the mux clock that (indirectly) feeds the device selects between a dedicated PLL, and some other stable clocks. When a clk rate change is requested, the (normally) upstream PLL is reconfigured. It's possible for the clock output of the PLL to become unstable during this process. To avoid causing the device to glitch, the mux should temporarily be switched over to another "stable" clock during the PLL rate change. This is done with clk notifiers. This patch adds common functions for notifiers to temporarily and transparently reparent mux clocks. This was loosely based on commit 8adfb08605a9 ("clk: sunxi-ng: mux: Add clk notifier functions"). Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> [Angelo: Changed mtk_mux_nb to hold a pointer to clk_ops instead of mtk_mux] Co-developed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Miles Chen <miles.chen@mediatek.com> Link: https://lore.kernel.org/r/20220927101128.44758-4-angelogioacchino.delregno@collabora.com Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
2022-09-29	clk: mediatek: mt8183: mfgcfg: Propagate rate changes to parent	Chen-Yu Tsai
	The only clock in the MT8183 MFGCFG block feeds the GPU. Propagate its rate change requests to its parent, so that DVFS for the GPU can work properly. Fixes: acddfc2c261b ("clk: mediatek: Add MT8183 clock support") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20220927101128.44758-3-angelogioacchino.delregno@collabora.com Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
2022-09-28	Merge tag 'mlx5-updates-2022-09-27' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2022-09-27 This is Part #1 of 4 parts series to align mlx5's implementation of XSK (AF_XDP) RX-Qs indexing and management with other vendors: Maxim Says: =========== xsk: Bug fixes for frame mapping on striding RQ Striding RQ relies on the driver mapping RX buffers into the NIC's virtual memory space. Currently, regadless of the XSK frame size, mlx5e maps them using MTT, and each mapping's length is PAGE_SIZE. As the result, the stride size used by striding RQ is also equal to PAGE_SIZE. This decision has the following issues: 1. In the XSK aligned mode with frame size smaller than PAGE_SIZE, it's suboptimal. Using 2K strides and 2K pages allows to post twice as fewer WQEs. 2. MTT is not suitable for unaligned frames, as it requires natural alignment theoretically, in practice at least 8-byte alignment. 3. Using mapping and stride bigger than the frame has risk of writing over the bounds of the XSK frame upon receiving packets bigger than MTU, which is possible in some specific configurations. This series addresses issues 1 and 2 and alleviates issue 3. Where possible, page and stride size will match the XSK frame size (firmware upgrade may be needed to have effect for 2K frames). Unaligned mode will use KSM instead of MTT, which allows to drop the partial workaround [1]. [1]: https://lore.kernel.org/netdev/YufYFQ6JN91lQbso@boxer/T/ ==================== Link: https://lore.kernel.org/r/20220927203611.244301-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net/mlx5e: Use runtime values of striding RQ parameters in datapath	Maxim Mikityanskiy
	Some of the parameters of striding RQ are compile-time constants, but they are going to become dynamically calculated at runtime in a following commit. This commit prepares the datapath to take cached runtime parameters, prefilled at queue creation. New fields added to struct mlx5e_rq fit into an existing 7-byte hole. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net/mlx5e: Make dma_info array dynamic in struct mlx5e_mpw_info	Maxim Mikityanskiy
	This commit moves the dma_info array to the end of struct mlx5e_mpw_info to make it a flexible array. It also removes the intermediate struct mlx5e_umr_dma_info, which used to contain only this array. The flexibility of dma_info will allow to choose its size dynamically in a following commit. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net/mlx5e: Improve the MTU change shortcut	Maxim Mikityanskiy
	Normally, the MTU change requires reopening the channels, but it can be skipped if the new MTU doesn't change any of the queue parameters and if MTU is not used in the data path. The shortcut is applicable to the non-linear mode of striding RQ, because the only thing affected by MTU is the queue length. As ethtool sets the queue length in packets, but striding RQ length is defined in strides or bytes, we estimate the RQ length to be at least as big as the requested number of MTU-sized packets, that's why it depends on MTU. Improve the shortcut by actually checking whether the RQ length stayed the same, instead of an intermediate step in the calculation. As MTU also affects the SHAMPO parameters, skip the shortcut if SHAMPO is in use. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net/mlx5e: xsk: Fix SKB headroom calculation in validation	Maxim Mikityanskiy
	In a typical scenario, if an XSK socket is opened first, then an XDP program is attached, mlx5e_validate_xsk_param will be called twice: first on XSK bind, second on channel restart caused by enabling XDP. The validation includes a call to mlx5e_rx_is_linear_skb, which checks the presence of the XDP program. The above means that mlx5e_rx_is_linear_skb might return true the first time, but false the second time, as mlx5e_rx_get_linear_sz_skb's return value will increase, because of a different headroom used with XDP. As XSK RQs never exist without XDP, it would make sense to trick mlx5e_rx_get_linear_sz_skb into thinking XDP is enabled at the first check as well. This way, if MTU is too big, it would be detected on XSK bind, without giving false hope to the userspace application. However, it turns out that this check is too restrictive in the first place. SKBs created on XDP_PASS on XSK RQs don't have any headroom. That means that big MTUs filtered out on the first and the second checks might actually work. So, address this issue in the proper way, but taking into account the absence of the SKB headroom on XSK RQs, when calculating the buffer size. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net/mlx5e: xsk: Remove dead code in validation	Maxim Mikityanskiy
	One of the checks in mlx5e_rx_is_linear_skb verifies that the RX buffer fits into the XSK frame size. Remove the duplicating check from mlx5e_validate_xsk_param. It allows to make mlx5e_rx_get_min_frag_sz static. Remove mlx5e_rx_is_xdp altogether, as its only usage is located in a branch where xsk == NULL. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net/mlx5e: Simplify stride size calculation for linear RQ	Maxim Mikityanskiy
	Linear RX buffers must be big enough to fit the MTU-sized packet along with the headroom. On the other hand, they must be small enough to fit into a page (or into an XSK frame). A straightforward way to check whether the linear mode is possible would be comparing the required buffer size to PAGE_SIZE or XSK frame size. Stride size in the linear mode is defined by the following constraints: 1. A stride is at least as big as the buffer size, and it's a power of two. 2. If non-XSK XDP is enabled, the stride size is PAGE_SIZE, because mlx5e requires each packet to be in its own page when XDP is in use. The previous constraint is automatically fulfilled, because buffer size can't be bigger than PAGE_SIZE. 3. XSK uses stride size equal to PAGE_SIZE, but the following commits will allow it to use roundup_pow_of_two(XSK frame size), by allowing the NIC's MMU to use page sizes not equal to the CPU page size. This commit puts the above requirements and constraints straight to the code in an attempt to simplify it and to prepare it for changes made in the next patches. For the reference, the old code uses an equivalent, but trickier calculation (high-level simplified pseudocode): if XDP or XSK: mlx5e_rx_get_linear_frag_sz := max(buffer size, PAGE_SIZE) else: mlx5e_rx_get_linear_frag_sz := buffer size mlx5e_rx_is_linear_skb := mlx5e_rx_get_linear_frag_sz <= PAGE_SIZE stride size := roundup_pow_of_two(mlx5e_rx_get_linear_frag_sz) The new code effectively removes mlx5e_rx_get_linear_frag_sz that used to return either buffer size or stride size, depending on the situation, making it hard to work with and to make changes: if XDP or XSK: mlx5e_rx_get_linear_stride_sz := PAGE_SIZE else mlx5e_rx_get_linear_stride_sz := roundup_pow_of_two(buffer size) mlx5e_rx_is_linear_skb := buffer size <= (PAGE_SIZE or XSK frame sz) stride size := mlx5e_rx_get_linear_stride_sz Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net/mlx5e: kTLS, Check ICOSQ WQE size in advance	Maxim Mikityanskiy
	Instead of WARNing in runtime when TLS offload WQEs posted to ICOSQ are over the hardware limit, check their size before enabling TLS RX offload, and block the offload if the condition fails. It also allows to drop a u16 field from struct mlx5e_icosq. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net/mlx5e: Use the aligned max TX MPWQE size	Maxim Mikityanskiy
	TX MPWQE size is limited to the cacheline-aligned maximum. Use the same value for the stop room and the capability check. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net/mlx5e: Fix a typo in mlx5e_xdp_mpwqe_is_full	Maxim Mikityanskiy
	Fix a typo in the function name: mpqwe -> mpwqe (stands for multi-packet work queue element). Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net/mlx5e: Use mlx5e_stop_room_for_max_wqe where appropriate	Maxim Mikityanskiy
	mlx5e_alloc_xdpsq calculates sq->stop_room internally, but there is already a function for that: mlx5e_stop_room_for_max_wqe. This commit makes use of this function. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net/mlx5e: Let mlx5e_get_sw_max_sq_mpw_wqebbs accept mdev	Maxim Mikityanskiy
	To shorten and simplify code, let mlx5e_get_sw_max_sq_mpw_wqebbs accept mdev and derive max SQ WQEBBs from it. Also rename the function to a more generic name mlx5e_get_max_sq_aligned_wqebbs, because the following patches will use it in non-MPWQE contexts. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net/mlx5e: Validate striding RQ before enabling XDP	Maxim Mikityanskiy
	Currently, the driver can silently fall back to legacy RQ after enabling XDP, even if striding RQ was active before. It happens when PAGE_SIZE is bigger than the maximum supported stride size. This commit changes this behavior to more straightforward: if an operation (enabling XDP) doesn't support the current parameters (striding RQ mode), it fails. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net/mlx5e: Make mlx5e_verify_rx_mpwqe_strides static	Maxim Mikityanskiy
	mlx5e_verify_rx_mpwqe_strides is only used in en/params.c, so it can be made static and removed from en/params.h. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28	net/mlx5e: Remove unused fields from datapath structs	Maxim Mikityanskiy
	No need to keep max_sq_wqebbs in mlx5e_txqsq and mlx5e_xdpsq, as it's only used when allocating the queues. Removing an extra field reduces the struct size. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>