summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-06-28fs: use guard for namespace_sem in statmount()Christian Brauner
Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-06-28fs: export mount options via statmount()Josef Bacik
statmount() can export arbitrary strings, so utilize the __spare1 slot for a mnt_opts string pointer, and then support asking for and setting the mount options during statmount(). This calls into the helper for showing mount options, which already uses a seq_file, so fits in nicely with our existing mechanism for exporting strings via statmount(). Signed-off-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/3aa6bf8bd5d0a21df9ebd63813af8ab532c18276.1719257716.git.josef@toxicpanda.com Reviewed-by: Jeff Layton <jlayton@kernel.org> [brauner: only call sb->s_op->show_options()] Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-06-28Merge patch series "Support foreign mount namespace with statmount/listmount"Christian Brauner
Josef Bacik <josef@toxicpanda.com> says: Currently the only way to iterate over mount entries in mount namespaces that aren't your own is to trawl through /proc in order to find /proc/$PID/mountinfo for the mount namespace that you want. This is hugely inefficient, so extend both statmount() and listmount() to allow specifying a mount namespace id in order to get to mounts in other mount namespaces. There are a few components to this 1. Having a global index of the mount namespace based on the ->seq value in the mount namespace. This gives us a unique identifier that isn't re-used. 2. Support looking up mount namespaces based on that unique identifier, and validating the user has permission to access the given mount namespace. 3. Provide a new ioctl() on nsfs in order to extract the unique identifier we can use for statmount() and listmount(). The code is relatively straightforward, and there is a selftest provided to validate everything works properly. This is based on vfs.all as of last week, so must be applied onto a tree that has Christians error handling rework in this area. If you wish you can pull the tree directly here https://github.com/josefbacik/linux/tree/listmount.combined Christian and I collaborated on this series, which is why there's patches from both of us in this series. Christian Brauner (4): fs: relax permissions for listmount() fs: relax permissions for statmount() fs: Allow listmount() in foreign mount namespace fs: Allow statmount() in foreign mount namespace Josef Bacik (4): fs: keep an index of current mount namespaces fs: export the mount ns id via statmount fs: add an ioctl to get the mnt ns id from nsfs selftests: add a test for the foreign mnt ns extensions fs/mount.h | 2 + fs/namespace.c | 240 ++++++++++-- fs/nsfs.c | 14 + include/uapi/linux/mount.h | 6 +- include/uapi/linux/nsfs.h | 2 + .../selftests/filesystems/statmount/Makefile | 2 +- .../filesystems/statmount/statmount.h | 46 +++ .../filesystems/statmount/statmount_test.c | 53 +-- .../filesystems/statmount/statmount_test_ns.c | 360 ++++++++++++++++++ 9 files changed, 659 insertions(+), 66 deletions(-) create mode 100644 tools/testing/selftests/filesystems/statmount/statmount.h create mode 100644 tools/testing/selftests/filesystems/statmount/statmount_test_ns.c Link: https://lore.kernel.org/r/cover.1719243756.git.josef@toxicpanda.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-06-28fs: rename show_mnt_opts -> show_vfsmnt_optsJosef Bacik
This name is more consistent with what the helper does, which is to just show the vfsmount options. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/fb363c62ffbf78a18095d596a19b8412aa991251.1719257716.git.josef@toxicpanda.com Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-06-28selftests: add a test for the foreign mnt ns extensionsJosef Bacik
This tests both statmount and listmount to make sure they work with the extensions that allow us to specify a mount ns to enter in order to find the mount entries. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/2d1a35bc9ab94b4656c056c420f25e429e7eb0b1.1719243756.git.josef@toxicpanda.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-06-28MAINTAINERS: Update IOMMU tree locationJoerg Roedel
Update the maintainers entries to the new location of the IOMMU tree. Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-06-28drm/mediatek: dpi/dsi: Fix possible_crtcs calculationMichael Walle
mtk_find_possible_crtcs() assumes that the main path will always have the CRTC with id 0, the ext id 1 and the third id 2. This is only true if the paths are all available. But paths are optional (see also comment in mtk_drm_kms_init()), e.g. the main path might not be enabled or available at all. Then the CRTC IDs will shift one up, e.g. ext will be 0 and the third path will be 1. To fix that, dynamically calculate the IDs by the presence of the paths. While at it, make the return code a signed one and return -ENODEV if no path is found and handle the error in the callers. Fixes: 5aa8e7647676 ("drm/mediatek: dpi/dsi: Change the getting possible_crtc way") Suggested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Signed-off-by: Michael Walle <mwalle@kernel.org> Link: https://patchwork.kernel.org/project/dri-devel/patch/20240606092122.2026313-1-mwalle@kernel.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2024-06-28Merge tag 'ieee802154-for-net-2024-06-27' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan into main
2024-06-28powerpc/pseries: Fix scv instruction crash with kexecNicholas Piggin
kexec on pseries disables AIL (reloc_on_exc), required for scv instruction support, before other CPUs have been shut down. This means they can execute scv instructions after AIL is disabled, which causes an interrupt at an unexpected entry location that crashes the kernel. Change the kexec sequence to disable AIL after other CPUs have been brought down. As a refresher, the real-mode scv interrupt vector is 0x17000, and the fixed-location head code probably couldn't easily deal with implementing such high addresses so it was just decided not to support that interrupt at all. Fixes: 7fa95f9adaee ("powerpc/64s: system call support for scv/rfscv instructions") Cc: stable@vger.kernel.org # v5.9+ Reported-by: Sourabh Jain <sourabhjain@linux.ibm.com> Closes: https://lore.kernel.org/3b4b2943-49ad-4619-b195-bc416f1d1409@linux.ibm.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Sourabh Jain <sourabhjain@linux.ibm.com> Link: https://msgid.link/20240625134047.298759-1-npiggin@gmail.com Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2024-06-28arm64: dts: rockchip: Delete the SoC variant dtsi for RK3399ProDragan Simic
The commit 587b4ee24fc7 ("arm64: dts: rockchip: add core dtsi file for RK3399Pro SoCs") describes the RK3399Pro's PCI Express interface as the way built-in NPU communicates with the rest of the SoC. All available evidence shows this not to be accurate, as described in detail below. Moreover, the rk3399pro.dtsi isn't used anywhere, so let's delete it. The publicly available schematics of the Radxa Rock Pi N10 carrier board [1] and the Vamrs VMARC RK3399Pro SoM, [2] which put together form the currently single supported RK3399Pro-based board, clearly show that the PCI Express x4 interface of this SoC is fully functional and actually not used by the SoC to communicate with the built-in NPU. In more detail, the VMARC SoM exports the SoC's PCI Express interface at its board-to-board connector, and the Rock Pi N10 routes it to an M.2 M-key slot with four PCI Express lanes. Both the Rockchip RK3399Pro datasheet, version 1.1, [3] and the Rockchip RK3399Pro technical reference manual (TRM), first part of the version 1.0, [4] don't describe that the SoC's PCI Express interface is reserved for the NPU. Instead, the RK3399Pro TRM describes that the NPU uses AHB and AXI interfaces as the host interface (HIF). The RK3399Pro datasheet clearly describes that the PCI Express x4 interface is available for general-purpose use, just like it's the case with the standard Rockchip RK3399 SoC, [5] albeit with a bit shorter feature list provided in the RK3399Pro datasheet. Even the publicly available reference RK3399Pro schematic from Rockchip [6] shows the availability of a standard PCI Express slot with four lanes, which would be pretty much impossible if the PCI Express interface was reserved for the communication with the built-in NPU. Based on the RK3399Pro datasheet [3] and the board schematics, [2][6] the built-in NPU actually exports NPU_PCIE as a separate PCI Express x2 interface that's partially pinmuxed with the NPU's separate USB 3.0 interface, which is described further in the next paragraph. However, the NPU's separate PCI Express x2 interface is left undocumented in the publicly available RK3399Pro documentation, in which it's clearly described as reserved for internal use and not intended for the communication with the NPU. Finally, the evidently independent nature of the separate NPU_PCIE x2 interface makes ignoring it safe when it comes to determining the nature and the availability of the RK3399Pro's main PCI Express x4 interface. The actual application-level communication with the built-in NPU, including powering it up and down and uploading the NPU firmware, is performed through the separate USB 2.0 and USB 3.0 interfaces exported by the NPU, [7] which the VMARC SoM [2] and the reference board design from Rockchip [6] route to the SoC's standard USB 2.0 and USB 3.0 interfaces, to make the NPU accessible to software running on the SoC's ARM cores. [1] https://dl.radxa.com/rockpin10/docs/hw/rockpi_n10_sch_v1.1_20190909.pdf [2] https://dl.radxa.com/rockpin10/docs/hw/VMARC_RK3399Pro_sch_V1.1_20190619.pdf [3] https://www.rockchip.fr/RK3399Pro%20datasheet%20V1.1.pdf [4] https://www.rockchip.fr/Rockchip%20RK3399Pro%20TRM%20V1.0%20Part1.pdf [5] https://www.rockchip.fr/RK3399%20datasheet%20V1.8.pdf [6] https://opensource.rock-chips.com/images/e/e4/RK_EVB_RK3399PRO_LP3S178P332SD8_V11_20181113_RZF.pdf [7] https://wiki.radxa.com/RockpiN10/dev/NPU-booting Signed-off-by: Dragan Simic <dsimic@manjaro.org> Link: https://lore.kernel.org/r/4449f7d4eead787308300e2d1d37b88c9d1446b2.1717308862.git.dsimic@manjaro.org Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2024-06-28arm64: dts: rockchip: Fix mic-in-differential usage on rk3568-evb1-v10Cristian Ciocaltea
The 'mic-in-differential' DT property supported by the RK809/RK817 audio codec driver is actually valid if prefixed with 'rockchip,': DTC_CHK arch/arm64/boot/dts/rockchip/rk3568-evb1-v10.dtb rk3568-evb1-v10.dtb: pmic@20: codec: 'mic-in-differential' does not match any of the regexes: 'pinctrl-[0-9]+' from schema $id: http://devicetree.org/schemas/mfd/rockchip,rk809.yaml# Make use of the correct property name. Fixes: 3e4c629ca680 ("arm64: dts: rockchip: enable rk809 audio codec on the rk3568 evb1-v10") Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Link: https://lore.kernel.org/r/20240622-rk809-fixes-v2-5-c0db420d3639@collabora.com Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2024-06-28arm64: dts: rockchip: Fix mic-in-differential usage on rk3566-roc-pcCristian Ciocaltea
The 'mic-in-differential' DT property supported by the RK809/RK817 audio codec driver is actually valid if prefixed with 'rockchip,': DTC_CHK arch/arm64/boot/dts/rockchip/rk3566-roc-pc.dtb rk3566-roc-pc.dtb: pmic@20: codec: 'mic-in-differential' does not match any of the regexes: 'pinctrl-[0-9]+' from schema $id: http://devicetree.org/schemas/mfd/rockchip,rk809.yaml# Make use of the correct property name. Fixes: a8e35c4bebe4 ("arm64: dts: rockchip: add audio nodes to rk3566-roc-pc") Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Link: https://lore.kernel.org/r/20240622-rk809-fixes-v2-4-c0db420d3639@collabora.com Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2024-06-28arm64: dts: rockchip: Drop invalid mic-in-differential on rk3568-rock-3aCristian Ciocaltea
The 'mic-in-differential' DT property supported by the RK809/RK817 audio codec driver is actually valid if prefixed with 'rockchip,': DTC_CHK arch/arm64/boot/dts/rockchip/rk3568-rock-3a.dtb rk3568-rock-3a.dtb: pmic@20: codec: 'mic-in-differential' does not match any of the regexes: 'pinctrl-[0-9]+' from schema $id: http://devicetree.org/schemas/mfd/rockchip,rk809.yaml# However, the board doesn't make use of differential signaling, hence drop the incorrect property and the now unnecessary 'codec' node. Fixes: 22a442e6586c ("arm64: dts: rockchip: add basic dts for the radxa rock3 model a") Reported-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Link: https://lore.kernel.org/r/20240622-rk809-fixes-v2-3-c0db420d3639@collabora.com Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2024-06-28arm64: dts: rockchip: Add rock5b overlays for PCIe endpoint modeNiklas Cassel
Add rock5b overlays for PCIe endpoint mode support. If using the rock5b as an endpoint against a normal PC, only the rk3588-rock-5b-pcie-ep.dtbo needs to be applied. If using two rock5b:s, with one board as EP and the other board as RC, rk3588-rock-5b-pcie-ep.dtbo and rk3588-rock-5b-pcie-srns.dtbo has to be applied to the respective boards. Signed-off-by: Niklas Cassel <cassel@kernel.org> Link: https://lore.kernel.org/r/20240607-rockchip-pcie-ep-v1-v5-13-0a042d6b0049@kernel.org Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2024-06-28arm64: dts: rockchip: Add PCIe endpoint mode supportNiklas Cassel
Add a device tree node representing PCIe endpoint mode. The controller can either be configured to run in Root Complex or Endpoint mode. If a user wants to run the controller in endpoint mode, the user has to disable the pcie3x4 node and enable the pcie3x4_ep node. Signed-off-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20240607-rockchip-pcie-ep-v1-v5-12-0a042d6b0049@kernel.org Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2024-06-28Merge branch 'mlx5-fixes' into mainDavid S. Miller
Tariq Toukan says: ==================== mlx5 fixes 2024-06-27 This patchset provides fixes from the team to the mlx5 core and EN drivers. The first 3 patches by Daniel replace a buggy cap field with a newly introduced one. Patch 4 by Chris de-couples ingress ACL creation from a specific flow, so it's invoked by other flows if needed. Patch 5 by Jianbo fixes a possible missing cleanup of QoS objects. Patches 6 and 7 by Leon fixes IPsec stats logic to better reflect the traffic. Series generated against: commit 02ea312055da ("octeontx2-pf: Fix coverity and klockwork issues in octeon PF driver") V2: Fixed wrong cited SHA in patch 6. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28net/mlx5e: Approximate IPsec per-SA payload data bytes countLeon Romanovsky
ConnectX devices lack ability to count payload data byte size which is needed for SA to return to libreswan for rekeying. As a solution let's approximate that by decreasing headers size from total size counted by flow steering. The calculation doesn't take into account any other headers which can be in the packet (e.g. IP extensions). Fixes: 5a6cddb89b51 ("net/mlx5e: Update IPsec per SA packets/bytes count") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28net/mlx5e: Present succeeded IPsec SA bytes and packetLeon Romanovsky
IPsec SA statistics presents successfully decrypted and encrypted packet and bytes, and not total handled by this SA. So update the calculation logic to take into account failures. Fixes: 6fb7f9408779 ("net/mlx5e: Connect mlx5 IPsec statistics with XFRM core") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28net/mlx5e: Add mqprio_rl cleanup and free in mlx5e_priv_cleanup()Jianbo Liu
In the cited commit, mqprio_rl cleanup and free are mistakenly removed in mlx5e_priv_cleanup(), and it causes the leakage of host memory and firmware SCHEDULING_ELEMENT objects while changing eswitch mode. So, add them back. Fixes: 0bb7228f7096 ("net/mlx5e: Fix mqprio_rl handling on devlink reload") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28net/mlx5: E-switch, Create ingress ACL when neededChris Mi
Currently, ingress acl is used for three features. It is created only when vport metadata match and prio tag are enabled. But active-backup lag mode also uses it. It is independent of vport metadata match and prio tag. And vport metadata match can be disabled using the following devlink command: # devlink dev param set pci/0000:08:00.0 name esw_port_metadata \ value false cmode runtime If ingress acl is not created, will hit panic when creating drop rule for active-backup lag mode. If always create it, there will be about 5% performance degradation. Fix it by creating ingress acl when needed. If esw_port_metadata is true, ingress acl exists, then create drop rule using existing ingress acl. If esw_port_metadata is false, create ingress acl and then create drop rule. Fixes: 1749c4c51c16 ("net/mlx5: E-switch, add drop rule support to ingress ACL") Signed-off-by: Chris Mi <cmi@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28net/mlx5: Use max_num_eqs_24b when setting max_io_eqsDaniel Jurgens
Due a bug in the device max_num_eqs doesn't always reflect a written value. As a result, setting max_io_eqs may not work but appear successful. Instead write max_num_eqs_24b, which reflects correct value. Fixes: 93197c7c509d ("mlx5/core: Support max_io_eqs for a function") Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28net/mlx5: Use max_num_eqs_24b capability if setDaniel Jurgens
A new capability with more bits is added. If it's set use that value as the maximum number of EQs available. This cap is also writable by the vhca_resource_manager to allow limiting the number of EQs available to SFs and VFs. Fixes: 93197c7c509d ("mlx5/core: Support max_io_eqs for a function") Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28net/mlx5: IFC updates for changing max EQsDaniel Jurgens
Expose new capability to support changing the number of EQs available to other functions. Fixes: 93197c7c509d ("mlx5/core: Support max_io_eqs for a function") Signed-off-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: William Tu <witu@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28Merge branch 'net-selftests-mirroring-cleanup' into mainDavid S. Miller
Petr Machata says: ==================== selftest: Clean-up and stabilize mirroring tests The mirroring selftests work by sending ICMP traffic between two hosts. Along the way, this traffic is mirrored to a gretap netdevice, and counter taps are then installed strategically along the path of the mirrored traffic to verify the mirroring took place. The problem with this is that besides mirroring the primary traffic, any other service traffic is mirrored as well. At the same time, because the tests need to work in HW-offloaded scenarios, the ability of the device to do arbitrary packet inspection should not be taken for granted. Most tests therefore simply use matchall, one uses flower to match on IP address. As a result, the selftests are noisy. mirror_test() accommodated this noisiness by giving the counters an allowance of several packets. But that only works up to a point, and on busy systems won't be always enough. In this patch set, clean up and stabilize the mirroring selftests. The original intention was to port the tests over to UDP, but the logic of ICMP ends up being so entangled in the mirroring selftests that the changes feel overly invasive. Instead, ICMP is kept, but where possible, we match on ICMP message type, thus filtering out hits by other ICMP messages. Where this is not practical (where the counter tap is put on a device that carries encapsulated packets), switch the counter condition to _at least_ X observed packets. This is less robust, but barely so -- probably the only scenario that this would not catch is something like erroneous packet duplication, which would hopefully get caught by the numerous other tests in this extensive suite. - Patches #1 to #3 clean up parameters at various helpers. - Patches #4 to #6 stabilize the mirroring selftests as described above. - Mirroring tests currently allow testing SW datapath even on HW netdevices by trapping traffic to the SW datapath. This complicates the tests a bit without a good reason: to test SW datapath, just run the selftests on the veth topology. Thus in patch #7, drop support for this dual SW/HW testing. - At this point, some cleanups were either made possible by the previous patches, or were always possible. In patches #8 to #11, realize these cleanups. - In patch #12, fix mlxsw mirror_gre selftest to respect setting TESTS. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28selftests: mlxsw: mirror_gre: Obey TESTSPetr Machata
This test is unusual in that overriding TESTS does not change the tests to be run. Split the individual tests into several functions and invoke them through tests_run() as appropriate. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28selftests: libs: Drop unused functionsPetr Machata
Nothing calls these. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28selftests: libs: Drop slow_path_trap_install()/_uninstall()Petr Machata
These functions are not used anymore. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28selftests: mirror_gre_lag_lacp: Drop unnecessary codePetr Machata
The selftest does not use functions from mirror_gre_lib, ditch the import. It does not use arping either, so drop the require_command as well. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28selftests: mlxsw: mirror_gre: SimplifyPetr Machata
After the previous patch, the function test_span_failable() is always called with should_fail=1. Drop the argument and streamline the code. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28selftests: mirror: Drop dual SW/HW testingPetr Machata
The mirroring tests are currently run in a skip_hw and optionally a skip_sw mode. The former tests the SW datapath, the latter the HW datapath, if available. In order to be able to test SW datapath on HW loopbacks, traps are installed on ingress to get traffic from the HW datapath to the SW one. This adds an unnecessary complexity when it would be much simpler to just use a veth-based topology to test the SW datapath. Thus drop all the code that supports this dual testing. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28selftests: mirror: mirror_test(): Allow exact count of packetsPetr Machata
The mirroring selftests work by sending ICMP traffic between two hosts. Along the way, this traffic is mirrored to a gretap netdevice, and counter taps are then installed strategically along the path of the mirrored traffic to verify the mirroring took place. The problem with this is that besides mirroring the primary traffic, any other service traffic is mirrored as well. At the same time, because the tests need to work in HW-offloaded scenarios, the ability of the device to do arbitrary packet inspection should not be taken for granted. Most tests therefore simply use matchall, one uses flower to match on IP address. As a result, the selftests are noisy, because besides the primary ICMP traffic, any amount of other service traffic is mirrored as well. mirror_test() accommodated this noisiness by giving the counters an allowance of several packets. But in the previous patch, where possible, counter taps were changed to match only on an exact ICMP message. At least in those cases, we can demand an exact number of packets to match. Where the tap is installed on a connective netdevice, the exact matching is not practical (though with u32, anything is possible). In those places, there should still be some leeway -- and probably bigger than before, because experience shows that these tests are very noisy. To that end, change mirror_test() so that it can be either called with an exact number to expect, or with an expression. Where leeway is needed, adjust callers to pass a ">= 10" instead of mere 10. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28selftests: mirror: do_test_span_dir_ips(): Install accurate tapsPetr Machata
The mirroring selftests work by sending ICMP traffic between two hosts. Along the way, this traffic is mirrored to a gretap netdevice, and counter taps are then installed strategically along the path of the mirrored traffic to verify the mirroring took place. The problem with this is that besides mirroring the primary traffic, any other service traffic is mirrored as well. At the same time, because the tests need to work in HW-offloaded scenarios, the ability of the device to do arbitrary packet inspection should not be taken for granted. Most tests therefore simply use matchall, one uses flower to match on IP address. As a result, the selftests are noisy, because besides the primary ICMP traffic, any amount of other service traffic is mirrored as well. However, often the counter tap is installed at the remote end of the gretap tunnel. Since this is a SW-datapath scenario anyway, we can make the filter arbitrarily accurate. Thus in this patch, add parameters forward_type and backward_type to several mirroring test helpers, as some other helpers already have. Then change do_test_span_dir_ips() to instead of installing one generic tap and using it for test in both directions, install the tap for each direction separately, matching on the ICMP type given by these parameters. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28selftests: mirror_gre_lag_lacp: Check counters at tunnelPetr Machata
The test works by sending packets through a tunnel, whence they are forwarded to a LAG. One of the LAG children is removed from the LAG prior to the exercise, and the test then counts how many packets pass through the other one. The issue with this is that it counts all packets, not just the encapsulated ones. So instead add a second gretap endpoint to receive the sent packets, and check reception counters there. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28selftests: lib: tc_rule_stats_get(): Move default to argument definitionPetr Machata
The argument $dir has a fallback value of "ingress". Move the fallback from the usage site to the argument definition block to make the fact clearer. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28selftests: mirror: Drop direction argument from several functionsPetr Machata
The argument is not used by these functions except to propagate it for ultimately no purpose. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28selftests: libs: Expand "$@" where possiblePetr Machata
In some functions, argument-forwarding through "$@" without listing the individual arguments explicitly is fundamental to the operation of a function. E.g. xfail_on_veth() should be able to run various tests in the fail-to-xfail regime, and usage of "$@" is appropriate as an abstraction mechanism. For functions such as simple_if_init(), $@ is a handy way to pass an array. In other functions, it's merely a mechanism to save some typing, which however ends up obscuring the real arguments and makes life hard for those that end up reading the code. This patch adds some of the implicit function arguments and correspondingly expands $@'s. In several cases this will come in handy as following patches adjust the parameter lists. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28drm/i915/mtl: Skip PLL state verification in TBT modeImre Deak
In TBT-alt mode the driver doesn't program the PHY's PLL, which is handled instead by Thunderbolt driver/FW components, hence the PLL's HW vs. SW state verification should be skipped. During HW readout set a flag in the PLL state if the port was at the moment in TBT-alt mode and skip the verification of PLL parameters in this case. Fixes: 45fe957ae769 ("drm/i915/display: Add compare config for MTL+ platforms") Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11258 Cc: Mika Kahola <mika.kahola@intel.com> Reviewed-by: Mika Kahola <mika.kahola@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240626170813.806470-1-imre.deak@intel.com
2024-06-28Merge branch 'net-flash-modees-firmware' into mainDavid S. Miller
Danielle Ratson says: ==================== Add ability to flash modules' firmware CMIS compliant modules such as QSFP-DD might be running a firmware that can be updated in a vendor-neutral way by exchanging messages between the host and the module as described in section 7.2.2 of revision 4.0 of the CMIS standard. According to the CMIS standard, the firmware update process is done using a CDB commands sequence. CDB (Command Data Block Message Communication) reads and writes are performed on memory map pages 9Fh-AFh according to the CMIS standard, section 8.12 of revision 4.0. Add a pair of new ethtool messages that allow: * User space to trigger firmware update of transceiver modules * The kernel to notify user space about the progress of the process The user interface is designed to be asynchronous in order to avoid RTNL being held for too long and to allow several modules to be updated simultaneously. The interface is designed with CMIS compliant modules in mind, but kept generic enough to accommodate future use cases, if these arise. The kernel interface that will implement the firmware update using CDB command will include 2 layers that will be added under ethtool: * The upper layer that will be triggered from the module layer, is cmis_ fw_update. * The lower one is cmis_cdb. In the future there might be more operations to implement using CDB commands. Therefore, the idea is to keep the cmis_cdb interface clean and the cmis_fw_update specific to the cdb commands handling it. The communication between the kernel and the driver will be done using two ethtool operations that enable reading and writing the transceiver module EEPROM. The operation ethtool_ops::get_module_eeprom_by_page, that is already implemented, will be used for reading from the EEPROM the CDB reply, e.g. reading module setting, state, etc. The operation ethtool_ops::set_module_eeprom_by_page, that is added in the current patchset, will be used for writing to the EEPROM the CDB command such as start firmware image, run firmware image, etc. Therefore in order for a driver to implement module flashing, that driver needs to implement the two functions mentioned above. Patchset overview: Patch #1-#2: Implement the EEPROM writing in mlxsw. Patch #3: Define the interface between the kernel and user space. Patch #4: Add ability to notify the flashing firmware progress. Patch #5: Veto operations during flashing. Patch #6: Add extended compliance codes. Patch #7: Add the cdb layer. Patch #8: Add the fw_update layer. Patch #9: Add ability to flash transceiver modules' firmware. v8: Patch #7: * In the ethtool_cmis_wait_for_cond() evaluate the condition once more to decide if the error code should be -ETIMEDOUT or something else. * s/netdev_err/netdev_err_once. v7: Patch #4: * Return -ENOMEM instead of PTR_ERR(attr) on ethnl_module_fw_flash_ntf_put_err(). Patch #9: * Fix Warning for not unlocking the spin_lock in the error flow on module_flash_fw_work_list_add(). * Avoid the fall-through on ethnl_sock_priv_destroy(). v6: * Squash some of the last patch to patch #5 and patch #9. Patch #3: * Add paragraph in .rst file. Patch #4: * Reserve '1' more place on SKB for NUL terminator in the error message string. * Add more prints on error flow, re-write the printing function and add ethnl_module_fw_flash_ntf_put_err(). * Change the communication method so notification will be sent in unicast instead of multicast. * Add new 'struct ethnl_module_fw_flash_ntf_params' that holds the relevant info for unicast communication and use it to send notification to the specific socket. * s/nla_put_u64_64bit/nla_put_uint/ Patch #7: * In ethtool_cmis_cdb_init(), Use 'const' for the 'params' parameter. Patch #8: * Add a list field to struct ethtool_module_fw_flash for module_fw_flash_work_list that will be presented in the next patch. * Move ethtool_cmis_fw_update() cleaning to a new function that will be represented in the next patch. * Move some of the fields in struct ethtool_module_fw_flash to a separate struct, so ethtool_cmis_fw_update() will get only the relevant parameters for it. * Edit the relevant functions to get the relevant params for them. * s/CMIS_MODULE_READY_MAX_DURATION_USEC/CMIS_MODULE_READY_MAX_DURATION_MSEC Patch #9: * Add a paragraph in the commit message. * Rename labels in module_flash_fw_schedule(). * Add info to genl_sk_priv_*() and implement the relevant callbacks, in order to handle properly a scenario of closing the socket from user space before the work item was ended. * Add a list the holds all the ethtool_module_fw_flash struct that corresponds to the in progress work items. * Add a new enum for the socket types. * Use both above to identify a flashing socket, add it to the list and when closing socket affect only the flashing type. * Create a new function that will get the work item instead of ethtool_cmis_fw_update(). * Edit the relevant functions to get the relevant params for them. * The new function will call the old ethtool_cmis_fw_update(), and do the cleaning, so the existence of the list should be completely isolated in module.c. =================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28ethtool: Add ability to flash transceiver modules' firmwareDanielle Ratson
Add the ability to flash the modules' firmware by implementing the interface between the user space and the kernel. Example from a succeeding implementation: # ethtool --flash-module-firmware swp40 file test.bin Transceiver module firmware flashing started for device swp40 Transceiver module firmware flashing in progress for device swp40 Progress: 99% Transceiver module firmware flashing completed for device swp40 In addition, add infrastructure that allows modules to set socket-specific private data. This ensures that when a socket is closed from user space during the flashing process, the right socket halts sending notifications to user space until the work item is completed. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28ethtool: cmis_fw_update: add a layer for supporting firmware update using CDBDanielle Ratson
According to the CMIS standard, the firmware update process is done using a CDB commands sequence. Implement a work that will be triggered from the module layer in the next patch the will initiate and execute all the CDB commands in order, to eventually complete the firmware update process. This flashing process includes, writing the firmware image, running the new firmware image and committing it after testing, so that it will run upon reset. This work will also notify user space about the progress of the firmware update process. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28ethtool: cmis_cdb: Add a layer for supporting CDB commandsDanielle Ratson
CDB (Command Data Block Message Communication) reads and writes are performed on memory map pages 9Fh-AFh according to the CMIS standard, section 8.20 of revision 5.2. Page 9Fh is used to specify the CDB command to be executed and also provides an area for a local payload (LPL). According to the CMIS standard, the firmware update process is done using a CDB commands sequence that will be implemented in the next patch. The kernel interface that will implement the firmware update using CDB command will include 2 layers that will be added under ethtool: * The upper layer that will be triggered from the module layer, is cmis_fw_update. * The lower one is cmis_cdb. In the future there might be more operations to implement using CDB commands. Therefore, the idea is to keep the CDB interface clean and the cmis_fw_update specific to the CDB commands handling it. These two layers will communicate using the API the consists of three functions: - struct ethtool_cmis_cdb * ethtool_cmis_cdb_init(struct net_device *dev, struct ethtool_module_fw_flash_params *params); - void ethtool_cmis_cdb_fini(struct ethtool_cmis_cdb *cdb); - int ethtool_cmis_cdb_execute_cmd(struct net_device *dev, struct ethtool_cmis_cdb_cmd_args *args); Add the CDB layer to support initializing, finishing and executing CDB commands: * The initialization process will include creating of an ethtool_cmis_cdb instance, querying the module CDB support, entering and validating the password from user space (CMD 0x0000) and querying the module features (CMD 0x0040). * The finishing API will simply free the ethtool_cmis_cdb instance. * The executing process will write the CDB command to EEPROM using set_module_eeprom_by_page() that was presented earlier, and will process the reply from EEPROM. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28net: sfp: Add more extended compliance codesDanielle Ratson
SFF-8024 is used to define various constants re-used in several SFF SFP-related specifications. Add SFF-8024 extended compliance code definitions for CMIS compliant modules and use them in the next patch to determine the firmware flashing work. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28ethtool: Veto some operations during firmware flashing processDanielle Ratson
Some operations cannot be performed during the firmware flashing process. For example: - Port must be down during the whole flashing process to avoid packet loss while committing reset for example. - Writing to EEPROM interrupts the flashing process, so operations like ethtool dump, module reset, get and set power mode should be vetoed. - Split port firmware flashing should be vetoed. In order to veto those scenarios, add a flag in 'struct net_device' that indicates when a firmware flash is taking place on the module and use it to prevent interruptions during the process. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28ethtool: Add flashing transceiver modules' firmware notifications abilityDanielle Ratson
Add progress notifications ability to user space while flashing modules' firmware by implementing the interface between the user space and the kernel. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28ethtool: Add an interface for flashing transceiver modules' firmwareDanielle Ratson
CMIS compliant modules such as QSFP-DD might be running a firmware that can be updated in a vendor-neutral way by exchanging messages between the host and the module as described in section 7.3.1 of revision 5.2 of the CMIS standard. Add a pair of new ethtool messages that allow: * User space to trigger firmware update of transceiver modules * The kernel to notify user space about the progress of the process The user interface is designed to be asynchronous in order to avoid RTNL being held for too long and to allow several modules to be updated simultaneously. The interface is designed with CMIS compliant modules in mind, but kept generic enough to accommodate future use cases, if these arise. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28mlxsw: Implement ethtool operation to write to a transceiver module EEPROMIdo Schimmel
Implement the ethtool_ops::set_module_eeprom_by_page operation to allow ethtool to write to a transceiver module EEPROM, in a similar fashion to the ethtool_ops::get_module_eeprom_by_page operation. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28ethtool: Add ethtool operation to write to a transceiver module EEPROMIdo Schimmel
Ethtool can already retrieve information from a transceiver module EEPROM by invoking the ethtool_ops::get_module_eeprom_by_page operation. Add a corresponding operation that allows ethtool to write to a transceiver module EEPROM. The new write operation is purely an in-kernel API and is not exposed to user space. The purpose of this operation is not to enable arbitrary read / write access, but to allow the kernel to write to specific addresses as part of transceiver module firmware flashing. In the future, more functionality can be implemented on top of these read / write operations. Adjust the comments of the 'ethtool_module_eeprom' structure as it is no longer used only for read access. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-28drm/stm: dsi: expose DSI PHY internal clockRaphael Gallais-Pou
DSISRC __________ __\_ | \ pll4_p_ck ->| 1 |____dsi_k ck_dsi_phy ->| 0 | |____/ A DSI clock is missing in the clock framework. Looking at the clk_summary, it appears that 'ck_dsi_phy' is not implemented. Since the DSI kernel clock is based on the internal DSI pll. The common clock driver can not directly expose this 'ck_dsi_phy' clock because it does not contain any common registers with the DSI. Thus it needs to be done directly within the DSI phy driver. Signed-off-by: Raphael Gallais-Pou <raphael.gallais-pou@foss.st.com> Acked-by: Yannick Fertre <yannick.fertre@foss.st.com> Tested-by: Yannick Fertre <yannick.fertre@foss.st.com> Signed-off-by: Philippe Cornu <philippe.cornu@foss.st.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240129104106.43141-4-raphael.gallais-pou@foss.st.com
2024-06-28drm/stm: dsi: add pm runtime opsYannick Fertre
Update control of clocks and supply thanks to the PM runtime mechanism to avoid kernel crash during a system suspend. Signed-off-by: Yannick Fertre <yannick.fertre@foss.st.com> Signed-off-by: Raphael Gallais-Pou <raphael.gallais-pou@foss.st.com> Acked-by: Yannick Fertre <yannick.fertre@foss.st.com> Tested-by: Yannick Fertre <yannick.fertre@foss.st.com> Signed-off-by: Philippe Cornu <philippe.cornu@foss.st.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240129104106.43141-3-raphael.gallais-pou@foss.st.com
2024-06-28drm/stm: dsi: use new SYSTEM_SLEEP_PM_OPS() macroRaphael Gallais-Pou
Use RUNTIME_PM_OPS() instead of the old SET_SYSTEM_SLEEP_PM_OPS(). This means we don't need __maybe_unused on the functions. Signed-off-by: Raphael Gallais-Pou <raphael.gallais-pou@foss.st.com> Acked-by: Yannick Fertre <yannick.fertre@foss.st.com> Tested-by: Yannick Fertre <yannick.fertre@foss.st.com> Signed-off-by: Philippe Cornu <philippe.cornu@foss.st.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240129104106.43141-2-raphael.gallais-pou@foss.st.com