summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-02-08arm64: dts: qcom: sm6115: Add watchdog node to dtsiBhupesh Sharma
Add watchdog node in Qualcomm sm6115 SoC dtsi. Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230119123200.1021735-1-bhupesh.sharma@linaro.org
2023-02-08arm64: dts: qcom: sc7280-idp: drop incorrect propertiesKrzysztof Kozlowski
The sound card does not expose DAIs and does not use custom qcom properties, so drop '#sound-dai-cells', 'qcom,msm-mbhc-gnd-swh' and 'qcom,msm-mbhc-hphl-swh': sc7280-idp.dtb: sound: '#sound-dai-cells', 'qcom,msm-mbhc-gnd-swh', 'qcom,msm-mbhc-hphl-swh' do not match any of the regexes: '^dai-link@[0-9a-f]$', 'pinctrl-[0-9]+' Reported-by: Srinivasa Rao Mandadapu <quic_srivasam@quicinc.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230119122205.73372-2-krzysztof.kozlowski@linaro.org
2023-02-08arm64: dts: qcom: sc7280-herobrine-audio-wcd9385: drop incorrect propertiesKrzysztof Kozlowski
The sound card does not expose DAIs and does not use custom qcom properties, so drop '#sound-dai-cells', 'qcom,msm-mbhc-gnd-swh' and 'qcom,msm-mbhc-hphl-swh': sc7280-herobrine-crd.dtb: sound: '#sound-dai-cells', 'qcom,msm-mbhc-gnd-swh', 'qcom,msm-mbhc-hphl-swh' do not match any of the regexes: '^dai-link@[0-9a-f]$', 'pinctrl-[0-9]+' Reported-by: Srinivasa Rao Mandadapu <quic_srivasam@quicinc.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230119122205.73372-1-krzysztof.kozlowski@linaro.org
2023-02-08arm64: dts: qcom: sm6115: Use 64 bit addressingKonrad Dybcio
SM6115's SMMU uses 36bit VAs, which is a good indicator that we should increase (dma-)ranges - and by extension #address- and #size-cells to prevent things from getting lost in translation (both literally and figuratively). Do so. Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230119101644.10711-2-konrad.dybcio@linaro.org
2023-02-08arm64: dts: qcom: sm6115: Add mdss_ prefix to mdss nodesKonrad Dybcio
Add a mdss_ prefix to mdss nodes to keep them all near each other when referencing them by label in device DTs. Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230119101644.10711-1-konrad.dybcio@linaro.org
2023-02-08ARM: dts: qcom: ipq8064: move reg-less nodes outside soc nodeChristian Marangi
Move node that doesn't have a reg outside the soc node as it should only contain reg nodes. No changes intended. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230117224417.7530-1-ansuelsmth@gmail.com
2023-02-08arm64: dts: qcom: msm8916-thwc: Add initial device treesYang Xiwen
This commit adds support for the ufi-001C and uf896 WiFi/LTE dongle made by Tong Heng Wei Chuang based on MSM8916. uf896 is another variant for the usb stick. The board design differs by using different gpios for the keys and leds. Note: The original firmware does not support 64-bit OS. It is necessary to flash 64-bit TZ firmware to boot arm64. Currently supported: - All CPU cores - Buttons - LEDs - Modem - SDHC - USB Device Mode - UART Co-developed-by: Jaime Breva <jbreva@nayarsystems.com> Signed-off-by: Jaime Breva <jbreva@nayarsystems.com> Co-developed-by: Nikita Travkin <nikita@trvn.ru> Signed-off-by: Nikita Travkin <nikita@trvn.ru> Signed-off-by: Yang Xiwen <forbidden405@foxmail.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-02-08dt-bindings: qcom: Document msm8916-thwc-uf896 and ufi001cYang Xiwen
Document the new thwc,uf896/ufi001c device tree bindings used in their device trees. Signed-off-by: Yang Xiwen <forbidden405@foxmail.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-02-08dt-bindings: vendor-prefixes: add thwcYang Xiwen
Shenzhen Tong Heng Wei Chuang Technology Co., Ltd. (hereinafter referred to as "Tong Heng Wei Chuang") is a focus on wireless communications equipment brand manufacturers. Link: http://www.szthwc.com/en/about.html Signed-off-by: Yang Xiwen <forbidden405@foxmail.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2023-02-08arm64: dts: qcom: sm6115: Add geni debug uart node for qup0Bhupesh Sharma
qup0 on sm6115 / sm4250 has 6 SEs, with SE4 as debug uart. Add the debug uart node in sm6115 dtsi file. Cc: Bjorn Andersson <andersson@kernel.org> Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230208122718.338545-1-bhupesh.sharma@linaro.org
2023-02-08ARM: pxa: restore mfp-pxa320.hArnd Bergmann
This file was removed in an earlier commit but is actually still needed, so restore it. Reported-by: Randy Dunlap <rdunlap@infradead.org> Fixes: d6df7df7ae5a ("ARM: pxa: remove unused board files") Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-02-08can: bittiming: can_calc_bittiming(): add missing parameter to no-op functionMarc Kleine-Budde
In commit 286c0e09e8e0 ("can: bittiming: can_changelink() pass extack down callstack") a new parameter was added to can_calc_bittiming(), however the static inline no-op (which is used if CONFIG_CAN_CALC_BITTIMING is disabled) wasn't converted. Add the new parameter to the static inline no-op of can_calc_bittiming(). Fixes: 286c0e09e8e0 ("can: bittiming: can_changelink() pass extack down callstack") Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/20230207201734.2905618-1-mkl@pengutronix.de Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-02-08can: raw: use temp variable instead of rolling back configOliver Hartkopp
Introduce a temporary variable to check for an invalid configuration attempt from user space. Before this patch the value was copied to the real config variable and rolled back in the case of an error. Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://lore.kernel.org/all/20230203090807.97100-1-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-02-08x86/cpu: Add Lunar Lake MKan Liang
Intel confirmed the existence of this CPU in Q4'2022 earnings presentation. Add the CPU model number. [ dhansen: Merging these as soon as possible makes it easier on all the folks developing model-specific features. ] Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20230208172340.158548-1-tony.luck%40intel.com
2023-02-08x86/kprobes: Fix 1 byte conditional jump targetNadav Amit
Commit 3bc753c06dd0 ("kbuild: treat char as always unsigned") broke kprobes. Setting a probe-point on 1 byte conditional jump can cause the kernel to crash when the (signed) relative jump offset gets treated as unsigned. Fix by replacing the unsigned 'immediate.bytes' (plus a cast) with the signed 'immediate.value' when assigning to the relative jump offset. [ dhansen: clarified changelog ] Fixes: 3bc753c06dd0 ("kbuild: treat char as always unsigned") Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Suggested-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20230208071708.4048-1-namit%40vmware.com
2023-02-08platform/chrome: cros_ec_typec: Fix spelling mistakeColin Ian King
There is a spelling mistake in a dev_warn message, make it lower case and fix the spelling. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Tzung-Bi Shih <tzungbi@kernel.org> Reviewed-by: Guenter Roeck <groeck@chromium.org> Link: https://lore.kernel.org/r/20230207091443.143995-1-colin.i.king@gmail.com Signed-off-by: Prashant Malani <pmalani@chromium.org> [pmalani fixed up commit message based on tzungbi comment]
2023-02-08sfc: move xdp_features configuration in efx_pci_probe_post_io()Lorenzo Bianconi
Move xdp_features configuration from efx_pci_probe() to efx_pci_probe_post_io() since it is where all the other basic netdev features are initialised. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/9bd31c9a29bcf406ab90a249a28fc328e5578fd1.1675875404.git.lorenzo@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-02-08bpf, docs: Add note about type conventionDave Thaler
Add explanation about use of "u64", "u32", etc. as the type convention used in BPF documentation. Signed-off-by: Dave Thaler <dthaler@microsoft.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20230127014706.1005-1-dthaler1968@googlemail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-08bpf/docs: Update design QA to be consistent with kfunc lifecycle docsToke Høiland-Jørgensen
Cong pointed out that there are some inconsistencies between the BPF design QA and the lifecycle expectations documentation we added for kfuncs. Let's update the QA file to be consistent with the kfunc docs, and add references where it makes sense. Also document that modules may export kfuncs now. v3: - Grammar nit + ack from David v2: - Fix repeated word (s/defined defined/defined/) Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: David Vernet <void@manifault.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20230208164143.286392-1-toke@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-08EDAC/i10nm: Add driver decoder for Sapphire Rapids serverYouquan Song
Intel SDM (December 2022) vol3B 17.13.2 contains IMC MC error codes for Sapphire Rapids. Current i10nm_edac only supports firmware decoder (ACPI DSM methods) for Sapphire Rapids. So add the driver decoder (decoding DDR memory errors via extracting error information from the IMC MC error codes) for Sapphire Rapids for better decoding performance. Co-developed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Youquan Song <youquan.song@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2023-02-08drm/i915: Fix VBT DSI DVO port handlingVille Syrjälä
Turns out modern (icl+) VBTs still declare their DSI ports as MIPI-A and MIPI-C despite the PHYs now being A and B. Remap appropriately to allow the panels declared as MIPI-C to work. Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8016 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230207064337.18697-2-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 118b5c136c04da705b274b0d39982bb8b7430fc5) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-02-08ASoC: SOF: ops: refine parameters order in function snd_sof_dsp_update8Rander Wang
SOF driver calls snd_sof_dsp_update8 with parameters mask and value but the snd_sof_dsp_update8 declares these two parameters in reverse order. This causes some issues such as d0i3 register can't be set correctly Now change function definition according to common SOF usage. Fixes: c28a36b012f1 ("ASoC: SOF: ops: add snd_sof_dsp_updateb() helper") Signed-off-by: Rander Wang <rander.wang@intel.com> Reviewed-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Chao Song <chao.song@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://lore.kernel.org/r/20230208104404.20554-1-peter.ujfalusi@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-02-08ARM: dts: at91: sam9x60_curiosity: Add device tree for sam9x60 curiosity boardDurai Manickam KR
Add device tree file for sam9x60 curiosity board. Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230207110651.197268-9-durai.manickamkr@microchip.com
2023-02-08dt-bindings: arm: at91: Add info on sam9x60 curiosityDurai Manickam KR
Adding the sam9x60 curiosity board from Microchip into the atmel AT91 board description yaml file. Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230207110651.197268-8-durai.manickamkr@microchip.com
2023-02-08ARM: dts: at91: sam9x60: Add missing flexcom definitionsManikandan Muralidharan
Added the missing flexcom functions for all the flexcom nodes. Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com> [durai.manickamkr@microchip.com: added missing UART compatibles] Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230207110651.197268-7-durai.manickamkr@microchip.com
2023-02-08ARM: dts: at91: sam9x60: Add DMA bindings for the flexcom nodesManikandan Muralidharan
Add dma bindings for flexcom nodes in the soc dtsi file. Users those who don't wish to use the DMA function for their flexcom functions can overwrite the dma bindings in the board device tree file. Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com> [durai.manickamkr@microchip.com: fixed code indentation and updated commit log] Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230207110651.197268-6-durai.manickamkr@microchip.com
2023-02-08ARM: dts: at91: sam9x60: Specify the FIFO size for the Flexcom UARTManikandan Muralidharan
The UART submodule in Flexcom has 16-byte Transmit and Receive FIFOs. Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com> Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230207110651.197268-5-durai.manickamkr@microchip.com
2023-02-08ARM: dts: at91: sam9x60: fix spi4 nodeDurai Manickam KR
The ranges, #address-cells and #size-cells properties are not required, remove them from the spi4 node. Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230207110651.197268-4-durai.manickamkr@microchip.com
2023-02-08ARM: dts: at91: sam9x60: move flexcom definitionsManikandan Muralidharan
Move the flexcom definitions from board specific DTS file to the SoC specific DTSI file for sam9x60ek. [durai.manickamkr@microchip.com: Logical split-up of this patch and added missing UART5 compatibles] Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com> Signed-off-by: Hari Prasath Gujulan Elango <Hari.PrasathGE@microchip.com> Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230207110651.197268-3-durai.manickamkr@microchip.com
2023-02-08ARM: dts: at91: sam9x60: Fix the label numbering for the flexcom functionsManikandan Muralidharan
Fixed the label numbering of the flexcom functions so that all 13 flexcom functions of sam9x60 are in the following order when the missing flexcom functions are added: flx0: uart0, spi0, i2c0 flx1: uart1, spi1, i2c1 flx2: uart2, spi2, i2c2 flx3: uart3, spi3, i2c3 flx4: uart4, spi4, i2c4 flx5: uart5, spi5, i2c5 flx6: uart6, i2c6 flx7: uart7, i2c7 flx8: uart8, i2c8 flx9: uart9, i2c9 flx10: uart10, i2c10 flx11: uart11, i2c11 flx12: uart12, i2c12 Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com> Signed-off-by: Durai Manickam KR <durai.manickamkr@microchip.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com> Link: https://lore.kernel.org/r/20230207110651.197268-2-durai.manickamkr@microchip.com
2023-02-08Merge branch 'taprio-auto-qmaxsdu-new-tx'David S. Miller
Vladimir Oltean says: ==================== taprio automatic queueMaxSDU and new TXQ selection procedure This patch set addresses 2 design limitations in the taprio software scheduler: 1. Software scheduling fundamentally prioritizes traffic incorrectly, in a way which was inspired from Intel igb/igc drivers and does not follow the inputs user space gives (traffic classes and TC to TXQ mapping). Patch 05/15 handles this, 01/15 - 04/15 are preparations for this work. 2. Software scheduling assumes that the gate for a traffic class closes as soon as the next interval begins. But this isn't true. If consecutive schedule entries have that traffic class gate open, there is no "gate close" event and taprio should keep dequeuing from that TC without interruptions. Patches 06/15 - 15/15 handle this. Patch 10/15 is a generic Qdisc change required for this to work. Future development directions which depend on this patch set are: - Propagating the automatic queueMaxSDU calculation down to offloading device drivers, instead of letting them calculate this, as vsc9959_tas_guard_bands_update() does today. - A software data path for tc-taprio with preemptible traffic and Hold/Release events. v1 at: https://patchwork.kernel.org/project/netdevbpf/cover/20230128010719.2182346-1-vladimir.oltean@nxp.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: don't segment unnecessarilyVladimir Oltean
Improve commit 497cc00224cf ("taprio: Handle short intervals and large packets") to only perform segmentation when skb->len exceeds what taprio_dequeue() expects. In practice, this will make the biggest difference when a traffic class gate is always open in the schedule. This is because the max_frm_len will be U32_MAX, and such large skb->len values as Kurt reported will be sent just fine unsegmented. What I don't seem to know how to handle is how to make sure that the segmented skbs themselves are smaller than the maximum frame size given by the current queueMaxSDU[tc]. Nonetheless, we still need to drop those, otherwise the Qdisc will hang. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: split segmentation logic from qdisc_enqueue()Vladimir Oltean
The majority of the taprio_enqueue()'s function is spent doing TCP segmentation, which doesn't look right to me. Compilers shouldn't have a problem in inlining code no matter how we write it, so move the segmentation logic to a separate function. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: automatically calculate queueMaxSDU based on TC gate ↵Vladimir Oltean
durations taprio today has a huge problem with small TC gate durations, because it might accept packets in taprio_enqueue() which will never be sent by taprio_dequeue(). Since not much infrastructure was available, a kludge was added in commit 497cc00224cf ("taprio: Handle short intervals and large packets"), which segmented large TCP segments, but the fact of the matter is that the issue isn't specific to large TCP segments (and even worse, the performance penalty in segmenting those is absolutely huge). In commit a54fc09e4cba ("net/sched: taprio: allow user input of per-tc max SDU"), taprio gained support for queueMaxSDU, which is precisely the mechanism through which packets should be dropped at qdisc_enqueue() if they cannot be sent. After that patch, it was necessary for the user to manually limit the maximum MTU per TC. This change adds the necessary logic for taprio to further limit the values specified (or not specified) by the user to some minimum values which never allow oversized packets to be sent. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: keep the max_frm_len information inside struct sched_gate_listVladimir Oltean
I have one practical reason for doing this and one concerning correctness. The practical reason has to do with a follow-up patch, which aims to mix 2 sources of max_sdu (one coming from the user and the other automatically calculated based on TC gate durations @current link speed). Among those 2 sources of input, we must always select the smaller max_sdu value, but this can change at various link speeds. So the max_sdu coming from the user must be kept separated from the value that is operationally used (the minimum of the 2), because otherwise we overwrite it and forget what the user asked us to do. To solve that, this patch proposes that struct sched_gate_list contains the operationally active max_frm_len, and q->max_sdu contains just what was requested by the user. The reason having to do with correctness is based on the following observation: the admin sched_gate_list becomes operational at a given base_time in the future. Until then, it is inactive and applies no shaping, all gates are open, etc. So the queueMaxSDU dropping shouldn't apply either (this is a mechanism to ensure that packets smaller than the largest gate duration for that TC don't hang the port; clearly it makes little sense if the gates are always open). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: warn about missing size tableVladimir Oltean
Vinicius intended taprio to take the L1 overhead into account when estimating packet transmission time through user input, specifically through the qdisc size table (man tc-stab). Something like this: tc qdisc replace dev $eth root stab overhead 24 taprio \ num_tc 8 \ map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time 0 \ sched-entry S 0x7e 9000000 \ sched-entry S 0x82 1000000 \ max-sdu 0 0 0 0 0 0 0 200 \ flags 0x0 clockid CLOCK_TAI Without the overhead being specified, transmission times will be underestimated and will cause late transmissions. For an offloading driver, it might even cause TX hangs if there is no open gate large enough to send the maximum sized packets for that TC (including L1 overhead). Properly knowing the L1 overhead will ensure that we are able to auto-calculate the queueMaxSDU per traffic class just right, and avoid these hangs due to head-of-line blocking. We can't make the stab mandatory due to existing setups, but we can warn the user that it's important with a warning netlink extack. Link: https://patchwork.kernel.org/project/netdevbpf/patch/20220505160357.298794-1-vladimir.oltean@nxp.com/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: make stab available before ops->init() callVladimir Oltean
Some qdiscs like taprio turn out to be actually pretty reliant on a well configured stab, to not underestimate the skb transmission time (by properly accounting for L1 overhead). In a future change, taprio will need the stab, if configured by the user, to be available at ops->init() time. It will become even more important in upcoming work, when the overhead will be used for the queueMaxSDU calculation that is passed to an offloading driver. However, rcu_assign_pointer(sch->stab, stab) is called right after ops->init(), making it unavailable, and I don't really see a good reason for that. Move it earlier, which nicely seems to simplify the error handling path as well. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: calculate guard band against actual TC gate close timeVladimir Oltean
taprio_dequeue_from_txq() looks at the entry->end_time to determine whether the skb will overrun its traffic class gate, as if at the end of the schedule entry there surely is a "gate close" event for it. Hint: maybe there isn't. For each schedule entry, introduce an array of kernel times which actually tracks when in the future will there be an *actual* gate close event for that traffic class, and use that in the guard band overrun calculation. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: calculate budgets per traffic classVladimir Oltean
Currently taprio assumes that the budget for a traffic class expires at the end of the current interval as if the next interval contains a "gate close" event for this traffic class. This is, however, an unfounded assumption. Allow schedule entry intervals to be fused together for a particular traffic class by calculating the budget until the gate *actually* closes. This means we need to keep budgets per traffic class, and we also need to update the budget consumption procedure. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: rename close_time to end_timeVladimir Oltean
There is a confusion in terms in taprio which makes what is called "close_time" to be actually used for 2 things: 1. determining when an entry "closes" such that transmitted skbs are never allowed to overrun that time (?!) 2. an aid for determining when to advance and/or restart the schedule using the hrtimer It makes more sense to call this so-called "close_time" "end_time", because it's not clear at all to me what "closes". Future patches will hopefully make better use of the term "to close". This is an absolutely mechanical change. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: calculate tc gate durationsVladimir Oltean
Current taprio code operates on a very simplistic (and incorrect) assumption: that egress scheduling for a traffic class can only take place for the duration of the current interval, or i.o.w., it assumes that at the end of each schedule entry, there is a "gate close" event for all traffic classes. As an example, traffic sent with the schedule below will be jumpy, even though all 8 TC gates are open, so there is absolutely no "gate close" event (effectively a transition from BIT(tc)==1 to BIT(tc)==0 in consecutive schedule entries): tc qdisc replace dev veth0 parent root taprio \ num_tc 2 \ map 0 1 \ queues 1@0 1@1 \ base-time 0 \ sched-entry S 0xff 4000000000 \ clockid CLOCK_TAI \ flags 0x0 This qdisc simply does not have what it takes in terms of logic to *actually* compute the durations of traffic classes. Also, it does not recognize the need to use this information on a per-traffic-class basis: it always looks at entry->interval and entry->close_time. This change proposes that each schedule entry has an array called tc_gate_duration[tc]. This holds the information: "for how long will this traffic class gate remain open, starting from *this* schedule entry". If the traffic class gate is always open, that value is equal to the cycle time of the schedule. We'll also need to keep track, for the purpose of queueMaxSDU[tc] calculation, what is the maximum time duration for a traffic class having an open gate. This gives us directly what is the maximum sized packet that this traffic class will have to accept. For everything else it has to qdisc_drop() it in qdisc_enqueue(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: give higher priority to higher TCs in software dequeue modeVladimir Oltean
Current taprio software implementation is haunted by the shadow of the igb/igc hardware model. It iterates over child qdiscs in increasing order of TXQ index, therefore giving higher xmit priority to TXQ 0 and lower to TXQ N. According to discussions with Vinicius, that is the default (perhaps even unchangeable) prioritization scheme used for the NICs that taprio was first written for (igb, igc), and we have a case of two bugs canceling out, resulting in a functional setup on igb/igc, but a less sane one on other NICs. To the best of my understanding, taprio should prioritize based on the traffic class, so it should really dequeue starting with the highest traffic class and going down from there. We get to the TXQ using the tc_to_txq[] netdev property. TXQs within the same TC have the same (strict) priority, so we should pick from them as fairly as we can. We can achieve that by implementing something very similar to q->curband from multiq_dequeue(). Since igb/igc really do have TXQ 0 of higher hardware priority than TXQ 1 etc, we need to preserve the behavior for them as well. We really have no choice, because in txtime-assist mode, taprio is essentially a software scheduler towards offloaded child tc-etf qdiscs, so the TXQ selection really does matter (not all igb TXQs support ETF/SO_TXTIME, says Kurt Kanzenbach). To preserve the behavior, we need a capability bit so that taprio can determine if it's running on igb/igc, or on something else. Because igb doesn't offload taprio at all, we can't piggyback on the qdisc_offload_query_caps() call from taprio_enable_offload(), but instead we need a separate call which is also made for software scheduling. Introduce two static keys to minimize the performance penalty on systems which only have igb/igc NICs, and on systems which only have other NICs. For mixed systems, taprio will have to dynamically check whether to dequeue using one prioritization algorithm or using the other. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: avoid calling child->ops->dequeue(child) twiceVladimir Oltean
Simplify taprio_dequeue_from_txq() by noticing that we can goto one call earlier than the previous skb_found label. This is possible because we've unified the treatment of the child->ops->dequeue(child) return call, we always try other TXQs now, instead of abandoning the root dequeue completely if we failed in the peek() case. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: refactor one skb dequeue from TXQ to separate functionVladimir Oltean
Future changes will refactor the TXQ selection procedure, and a lot of stuff will become messy, the indentation of the bulk of the dequeue procedure would increase, etc. Break out the bulk of the function into a new one, which knows the TXQ (child qdisc) we should perform a dequeue from. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: continue with other TXQs if one dequeue() failedVladimir Oltean
This changes the handling of an unlikely condition to not stop dequeuing if taprio failed to dequeue the peeked skb in taprio_dequeue(). I've no idea when this can happen, but the only side effect seems to be that the atomic_sub_return() call right above will have consumed some budget. This isn't a big deal, since either that made us remain without any budget (and therefore, we'd exit on the next peeked skb anyway), or we could send some packets from other TXQs. I'm making this change because in a future patch I'll be refactoring the dequeue procedure to simplify it, and this corner case will have to go away. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08net/sched: taprio: delete peek() implementationVladimir Oltean
There isn't any code in the network stack which calls taprio_peek(). We only see qdisc->ops->peek() being called on child qdiscs of other classful qdiscs, never from the generic qdisc code. Whereas taprio is never a child qdisc, it is always root. This snippet of a comment from qdisc_peek_dequeued() seems to confirm: /* we can reuse ->gso_skb because peek isn't called for root qdiscs */ Since I've been known to be wrong many times though, I'm not completely removing it, but leaving a stub function in place which emits a warning. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08Merge branch 'mptcp-fixes'David S. Miller
Matthieu Baerts says: ==================== mptcp: fixes for v6.2 Patch 1 clears resources earlier if there is no more reasons to keep MPTCP sockets alive. Patches 2 and 3 fix some locking issues visible in some rare corner cases: the linked issues should be quite hard to reproduce. Patch 4 makes sure subflows are correctly cleaned after the end of a connection. Patch 5 and 6 improve the selftests stability when running in a slow environment by transfering data for a longer period on one hand and by stopping the tests when all expected events have been observed on the other hand. All these patches fix issues introduced before v6.2. ==================== Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08selftests: mptcp: stop tests earlierMatthieu Baerts
These 'endpoint' tests from 'mptcp_join.sh' selftest start a transfer in the background and check the status during this transfer. Once the expected events have been recorded, there is no reason to wait for the data transfer to finish. It can be stopped earlier to reduce the execution time by more than half. For these tests, the exchanged data were not verified. Errors, if any, were ignored but that's fine, plenty of other tests are looking at that. It is then OK to mute stderr now that we are sure errors will be printed (and still ignored) because the transfer is stopped before the end. Fixes: e274f7154008 ("selftests: mptcp: add subflow limits test-cases") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08selftests: mptcp: allow more slack for slow test-casePaolo Abeni
A test-case is frequently failing on some extremely slow VMs. The mptcp transfer completes before the script is able to do all the required PM manipulation. Address the issue in the simplest possible way, making the transfer even more slow. Additionally dump more info in case of failures, to help debugging similar problems in the future and init dump_stats var. Fixes: e274f7154008 ("selftests: mptcp: add subflow limits test-cases") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/323 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08mptcp: be careful on subflow status propagation on errorsPaolo Abeni
Currently the subflow error report callback unconditionally propagates the fallback subflow status to the owning msk. If the msk is already orphaned, the above prevents the code from correctly tracking the msk moving to the TCP_CLOSE state and doing the appropriate cleanup. All the above causes increasing memory usage over time and sporadic self-tests failures. There is a great deal of infrastructure trying to propagate correctly the fallback subflow status to the owning mptcp socket, e.g. via mptcp_subflow_eof() and subflow_sched_work_if_closed(): in the error propagation path we need only to cope with unorphaned sockets. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/339 Fixes: 15cc10453398 ("mptcp: deliver ssk errors to msk") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>