summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2024-09-10i2c: designware: Consolidate firmware parsing and configuring codeAndy Shevchenko
We have the same code flows in the PCI and platform drivers. Moreover, the flow requires the common code to export a few functions. Instead, consolidate that flow under new function called i2c_dw_fw_parse_and_configure() and drop unneeded exports. Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: designware: Rename dw_i2c_of_configure() -> i2c_dw_of_configure()Andy Shevchenko
For the sake of consistency, rename dw_i2c_of_configure() and change its parameter to be aligned with the i2c_dw_acpi_configure(). Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: riic: Add support for fast mode plusClaudiu Beznea
Fast mode plus is available on most of the IP variants that RIIC driver is working with. The exception is (according to HW manuals of the SoCs where this IP is available) the Renesas RZ/A1H. For this, patch introduces the struct riic_of_data::fast_mode_plus. Fast mode plus was tested on RZ/G3S, RZ/G2{L,UL,LC}, RZ/Five by instantiating the RIIC frequency to 1MHz and issuing i2c reads on the fast mode plus capable devices (and the i2c clock frequency was checked on RZ/G3S). Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: riic: Define individual arrays to describe the register offsetsClaudiu Beznea
Define individual arrays to describe the register offsets. In this way we can describe different IP variants that share the same register offsets but have differences in other characteristics. Commit prepares for the addition of fast mode plus. Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: riic: Add suspend/resume supportClaudiu Beznea
Add suspend/resume support for the RIIC driver. This is necessary for the Renesas RZ/G3S SoC which support suspend to deep sleep state where power to most of the SoC components is turned off. As a result the I2C controller needs to be reconfigured after suspend/resume. For this, the reset line was stored in the driver private data structure as well as i2c timings. The reset line and I2C timings are necessary to re-initialize the controller after resume. Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: riic: Enable runtime PM autosuspend supportClaudiu Beznea
Enable runtime PM autosuspend support for the RIIC driver. With this, in case there are consecutive xfer requests the device wouldn't be runtime enabled/disabled after each consecutive xfer but after the the delay configured by user. With this, we can avoid touching hardware registers involved in runtime PM suspend/resume saving in this way some cycles. The default chosen autosuspend delay is zero to keep the previous driver behavior. Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: riic: Use pm_runtime_resume_and_get()Claudiu Beznea
pm_runtime_get_sync() may return with error. In case it returns with error dev->power.usage_count needs to be decremented. pm_runtime_resume_and_get() takes care of this. Thus use it. Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: riic: Call pm_runtime_get_sync() when need to access registersClaudiu Beznea
There is no need to runtime resume the device as long as the IP registers are not accessed. Calling pm_runtime_get_sync() at the register access time leads to a simpler error path. Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: riic: Use temporary variable for struct deviceClaudiu Beznea
Use a temporary variable for the struct device pointers to avoid dereferencing. Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: designware: Drop return value from dw_i2c_of_configure()Andy Shevchenko
dw_i2c_of_configure() is called without checking of the returned value, hence just drop it by converting to void. Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: designware: Drop return value from i2c_dw_acpi_configure()Andy Shevchenko
i2c_dw_acpi_configure() is called without checking of the returned value, hence just drop it by converting to void. Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: designware: Always provide device ID tablesAndy Shevchenko
There is no need to have ugly ifdeffery and additional macros for the device ID tables. Always provide them. Since we touch the ACPI table, make it sorted by ID. Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: designware: Unify terminator in device ID tablesAndy Shevchenko
Make the terminator entry look the same in all device ID tables. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: designware: Add missing 'c' into PCI IDs variable nameAndy Shevchenko
Add missing 'c' into i2c_designware_pci_ids variable name. Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: designware: Let PCI core to take care about interrupt vectorsAndy Shevchenko
PCI core, after pcim_enable_device(), takes care about the allocated IRQ vectors, no need to do it explicitly and break the cleaning up order. Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: designware: Replace a while-loop by for-loopAndy Shevchenko
Replace a while-loop by for-loop in i2c_dw_probe_lock_support() to save a few lines of code. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: qcom-geni: Use goto for clearer exit pathAndi Shyti
Refactor the code by using goto statements to reduce duplication and make the exit path clearer. Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: imx: Switch to RUNTIME_PM_OPS()Fabio Estevam
Replace SET_RUNTIME_PM_OPS() with its modern RUNTIME_PM_OPS() alternative. The combined usage of pm_ptr() and RUNTIME_PM_OPS() allows the compiler to evaluate if the runtime suspend/resume() functions are used at build time or are simply dead code. This allows removing the __maybe_unused notation from the runtime suspend/resume() functions. Signed-off-by: Fabio Estevam <festevam@denx.de> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: mt65xx: Avoid double initialization of restart_flag in isrAngeloGioacchino Del Regno
In the mtk_i2c_irq() handler, variable restart_flag is initialized to zero and then reassigned with I2C_RS_TRANSFER if and only if auto_restart is enabled. Avoid a double initialization of this variable by transferring the auto_restart check to the restart_flag declaration. This commit brings no functional changes. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: don't use ',' after delimitersWolfram Sang
Delimiters are meant to be last, no need for a ',' there. Remove a superfluous newline in the ali1535 driver while here. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Andrew Jeffery <andrew@codeconstruct.com.au> Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10i2c: designware: Fix wrong setting for {ss,fs,hs}_{h,l}cnt registersAdrian Huang
When disabling CONFIG_X86_AMD_PLATFORM_DEVICE option, the driver 'drivers/acpi/acpi_apd.c' won't be compiled. This leads to a situation where BMC (Baseboard Management Controller) cannot retrieve the memory temperature via the i2c interface after i2c DW driver is loaded. Note that BMC can retrieve the memory temperature before booting into OS. [Debugging Detail] 1. dev->pclk and dev->clk are NULL when calling devm_clk_get_optional() in dw_i2c_plat_probe(). 2. The callings of i2c_dw_scl_hcnt() in i2c_dw_set_timings_master() return 65528 (-8 in integer format) or 65533 (-3 in integer format). The following log shows SS's HCNT/LCNT: i2c_designware AMDI0010:01: Standard Mode HCNT:LCNT = 65533:65535 3. The callings of i2c_dw_scl_lcnt() in i2c_dw_set_timings_master() return 65535 (-1 in integer format). The following log shows SS's HCNT/LCNT: i2c_designware AMDI0010:01: Fast Mode HCNT:LCNT = 65533:65535 4. i2c_dw_init_master() configures the register IC_SS_SCL_HCNT with the value 65533. However, the DW i2c databook mentioned the value cannot be higher than 65525. Quote from the DW i2c databook: NOTE: This register must not be programmed to a value higher than 65525, because DW_apb_i2c uses a 16-bit counter to flag an I2C bus idle condition when this counter reaches a value of IC_SS_SCL_HCNT + 10. 5. Since ss_hcnt, ss_lcnt, fs_hcnt, and fs_lcnt are the invalid values, we should not write the corresponding registers. Fix the issue by reading dev->{ss,fs,hs}_hcnt and dev->{ss,fs,hs}_lcnt from HW registers if ic_clk is not set. Reported-by: Dong Wang <wangdong28@lenovo.com> Suggested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Adrian Huang <ahuang12@lenovo.com> Tested-by: Dong Wang <wangdong28@lenovo.com> Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/linux-i2c/8295cbe1-a7c5-4a35-a189-5d0bff51ede6@linux.intel.com/
2024-09-09clk: rockchip: remove unused mclk_pdm0_p/pdm0_p definitionsArnd Bergmann
When -Wunused-const-variable is enabled (not the default), there is a warning about two definitions in this file: In file included from drivers/clk/rockchip/clk-rk3576.c:14: drivers/clk/rockchip/clk-rk3576.c:334:7: error: 'mclk_pdm0_p' defined but not used [-Werror=unused-const-variable=] 334 | PNAME(mclk_pdm0_p) = { "mclk_pdm0_src_top", "xin24m" }; | ^~~~~~~~~~~ drivers/clk/rockchip/clk.h:564:43: note: in definition of macro 'PNAME' 564 | #define PNAME(x) static const char *const x[] __initconst | ^ drivers/clk/rockchip/clk-rk3576.c:333:7: error: 'pdm0_p' defined but not used [-Werror=unused-const-variable=] 333 | PNAME(pdm0_p) = { "clk_pdm0_src_top", "xin24m" }; | ^~~~~~ drivers/clk/rockchip/clk.h:564:43: note: in definition of macro 'PNAME' 564 | #define PNAME(x) static const char *const x[] __initconst | ^ Remove them for the moment. If they are needed later, they can be added back at that point. Fixes: cc40f5baa91b ("clk: rockchip: Add clock controller for the RK3576") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20240909121116.254036-1-arnd@kernel.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2024-09-09clk: qcom: clk-alpha-pll: Simplify the zonda_pll_adjust_l_val()Satya Priya Kakitapalli
In zonda_pll_adjust_l_val() replace the divide operator with comparison operator to fix below build error and smatch warning. drivers/clk/qcom/clk-alpha-pll.o: In function `clk_zonda_pll_set_rate': clk-alpha-pll.c:(.text+0x45dc): undefined reference to `__aeabi_uldivmod' smatch warnings: drivers/clk/qcom/clk-alpha-pll.c:2129 zonda_pll_adjust_l_val() warn: replace divide condition '(remainder * 2) / prate' with '(remainder * 2) >= prate' Fixes: f4973130d255 ("clk: qcom: clk-alpha-pll: Update set_rate for Zonda PLL") Reported-by: Jon Hunter <jonathanh@nvidia.com> Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202408110724.8pqbpDiD-lkp@intel.com/ Signed-off-by: Satya Priya Kakitapalli <quic_skakitap@quicinc.com> Link: https://lore.kernel.org/r/20240906113905.641336-1-quic_skakitap@quicinc.com Reviewed-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2024-09-09idpf: enable WB_ON_ITRJoshua Hay
Tell hardware to write back completed descriptors even when interrupts are disabled. Otherwise, descriptors might not be written back until the hardware can flush a full cacheline of descriptors. This can cause unnecessary delays when traffic is light (or even trigger Tx queue timeout). The example scenario to reproduce the Tx timeout if the fix is not applied: - configure at least 2 Tx queues to be assigned to the same q_vector, - generate a huge Tx traffic on the first Tx queue - try to send a few packets using the second Tx queue. In such a case Tx timeout will appear on the second Tx queue because no completion descriptors are written back for that queue while interrupts are disabled due to NAPI polling. Fixes: c2d548cad150 ("idpf: add TX splitq napi poll support") Fixes: a5ab9ee0df0b ("idpf: add singleq start_xmit and napi poll") Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Co-developed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-09-09idpf: fix netdev Tx queue stop/wakeMichal Kubiak
netif_txq_maybe_stop() returns -1, 0, or 1, while idpf_tx_maybe_stop_common() says it returns 0 or -EBUSY. As a result, there sometimes are Tx queue timeout warnings despite that the queue is empty or there is at least enough space to restart it. Make idpf_tx_maybe_stop_common() inline and returning true or false, handling the return of netif_txq_maybe_stop() properly. Use a correct goto in idpf_tx_maybe_stop_splitq() to avoid stopping the queue or incrementing the stops counter twice. Fixes: 6818c4d5b3c2 ("idpf: add splitq start_xmit") Fixes: a5ab9ee0df0b ("idpf: add singleq start_xmit and napi poll") Cc: stable@vger.kernel.org # 6.7+ Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-09-09idpf: refactor Tx completion routinesJoshua Hay
Add a mechanism to guard against stashing partial packets into the hash table to make the driver more robust, with more efficient decision making when cleaning. Don't stash partial packets. This can happen when an RE (Report Event) completion is received in flow scheduling mode, or when an out of order RS (Report Status) completion is received. The first buffer with the skb is stashed, but some or all of its frags are not because the stack is out of reserve buffers. This leaves the ring in a weird state since the frags are still on the ring. Use the field libeth_sqe::nr_frags to track the number of fragments/tx_bufs representing the packet. The clean routines check to make sure there are enough reserve buffers on the stack before stashing any part of the packet. If there are not, next_to_clean is left pointing to the first buffer of the packet that failed to be stashed. This leaves the whole packet on the ring, and the next time around, cleaning will start from this packet. An RS completion is still expected for this packet in either case. So instead of being cleaned from the hash table, it will be cleaned from the ring directly. This should all still be fine since the DESC_UNUSED and BUFS_UNUSED will reflect the state of the ring. If we ever fall below the thresholds, the TxQ will still be stopped, giving the completion queue time to catch up. This may lead to stopping the queue more frequently, but it guarantees the Tx ring will always be in a good state. Also, always use the idpf_tx_splitq_clean function to clean descriptors, i.e. use it from clean_buf_ring as well. This way we avoid duplicating the logic and make sure we're using the same reserve buffers guard rail. This does require a switch from the s16 next_to_clean overflow descriptor ring wrap calculation to u16 and the normal ring size check. Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-09-09idpf: convert to libeth Tx buffer completionAlexander Lobakin
&idpf_tx_buffer is almost identical to the previous generations, as well as the way it's handled. Moreover, relying on dma_unmap_addr() and !!buf->skb instead of explicit defining of buffer's type was never good. Use the newly added libeth helpers to do it properly and reduce the copy-paste around the Tx code. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-09-09regulator: tps6287x: Constify struct regulator_descChristophe JAILLET
'struct regulator_desc' is not modified in this driver. Constifying this structure moves some data to a read-only section, so increases overall security, especially when the structure holds some function pointers. On a x86_64, with allmodconfig: Before: ====== text data bss dec hex filename 4974 736 16 5726 165e drivers/regulator/tps6287x-regulator.o After: ===== text data bss dec hex filename 5294 416 16 5726 165e drivers/regulator/tps6287x-regulator.o -- Compile tested only Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://patch.msgid.link/7727e493490d37775a653905dfe0cc1d8478f8e0.1725908163.git.christophe.jaillet@wanadoo.fr Signed-off-by: Mark Brown <broonie@kernel.org>
2024-09-09regulator: wm8400: Constify struct regulator_descChristophe JAILLET
'struct regulator_desc' is not modified in this driver. Constifying this structure moves some data to a read-only section, so increases overall security, especially when the structure holds some function pointers. On a x86_64, with allmodconfig: Before: ====== text data bss dec hex filename 4419 2512 0 6931 1b13 drivers/regulator/wm8400-regulator.o After: ===== text data bss dec hex filename 6307 624 0 6931 1b13 drivers/regulator/wm8400-regulator.o -- Compile tested only Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://patch.msgid.link/fde33ecfd9bbdbdc1da1620c9f3b1b7a72f9d805.1725906876.git.christophe.jaillet@wanadoo.fr Signed-off-by: Mark Brown <broonie@kernel.org>
2024-09-09net/mlx5: Fix bridge mode operations when there are no VFsBenjamin Poirier
Currently, trying to set the bridge mode attribute when numvfs=0 leads to a crash: bridge link set dev eth2 hwmode vepa [ 168.967392] BUG: kernel NULL pointer dereference, address: 0000000000000030 [...] [ 168.969989] RIP: 0010:mlx5_add_flow_rules+0x1f/0x300 [mlx5_core] [...] [ 168.976037] Call Trace: [ 168.976188] <TASK> [ 168.978620] _mlx5_eswitch_set_vepa_locked+0x113/0x230 [mlx5_core] [ 168.979074] mlx5_eswitch_set_vepa+0x7f/0xa0 [mlx5_core] [ 168.979471] rtnl_bridge_setlink+0xe9/0x1f0 [ 168.979714] rtnetlink_rcv_msg+0x159/0x400 [ 168.980451] netlink_rcv_skb+0x54/0x100 [ 168.980675] netlink_unicast+0x241/0x360 [ 168.980918] netlink_sendmsg+0x1f6/0x430 [ 168.981162] ____sys_sendmsg+0x3bb/0x3f0 [ 168.982155] ___sys_sendmsg+0x88/0xd0 [ 168.985036] __sys_sendmsg+0x59/0xa0 [ 168.985477] do_syscall_64+0x79/0x150 [ 168.987273] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 168.987773] RIP: 0033:0x7f8f7950f917 (esw->fdb_table.legacy.vepa_fdb is null) The bridge mode is only relevant when there are multiple functions per port. Therefore, prevent setting and getting this setting when there are no VFs. Note that after this change, there are no settings to change on the PF interface using `bridge link` when there are no VFs, so the interface no longer appears in the `bridge link` output. Fixes: 4b89251de024 ("net/mlx5: Support ndo bridge_setlink and getlink") Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: Verify support for scheduling element and TSAR typeCarolina Jubran
Before creating a scheduling element in a NIC or E-Switch scheduler, ensure that the requested element type is supported. If the element is of type Transmit Scheduling Arbiter (TSAR), also verify that the specific TSAR type is supported. Fixes: 214baf22870c ("net/mlx5e: Support HTB offload") Fixes: 85c5f7c9200e ("net/mlx5: E-switch, Create QoS on demand") Fixes: 0fe132eac38c ("net/mlx5: E-switch, Allow to add vports to rate groups") Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: Explicitly set scheduling element and TSAR typeCarolina Jubran
Ensure the scheduling element type and TSAR type are explicitly initialized in the QoS rate group creation. This prevents potential issues due to default values. Fixes: 1ae258f8b343 ("net/mlx5: E-switch, Introduce rate limiting groups API") Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5e: Add missing link mode to ptys2ext_ethtool_mapShahar Shitrit
Add MLX5E_400GAUI_8_400GBASE_CR8 to the extended modes in ptys2ext_ethtool_table, since it was missing. Fixes: 6a897372417e ("net/mlx5: ethtool, Add ethtool support for 50Gbps per lane link modes") Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5e: Add missing link modes to ptys2ethtool_mapShahar Shitrit
Add MLX5E_1000BASE_T and MLX5E_100BASE_TX to the legacy modes in ptys2legacy_ethtool_table, since they were missing. Fixes: 665bc53969d7 ("net/mlx5e: Use new ethtool get/set link ksettings API") Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: Update the list of the PCI supported devicesMaher Sanalla
Add the upcoming ConnectX-9 device ID to the table of supported PCI device IDs. Fixes: f908a35b2218 ("net/mlx5: Update the list of the PCI supported devices") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09ieee802154: Fix build errorJinjie Ruan
If REGMAP_SPI is m and IEEE802154_MCR20A is y, mcr20a.c:(.text+0x3ed6c5b): undefined reference to `__devm_regmap_init_spi' ld: mcr20a.c:(.text+0x3ed6cb5): undefined reference to `__devm_regmap_init_spi' Select REGMAP_SPI for IEEE802154_MCR20A to fix it. Fixes: 8c6ad9cc5157 ("ieee802154: Add NXP MCR20A IEEE 802.15.4 transceiver driver") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://lore.kernel.org/20240909131740.1296608-1-ruanjinjie@huawei.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2024-09-09cxl/region: Remove lock from memory notifier callbackIra Weiny
In testing Dynamic Capacity Device (DCD) support, a lockdep splat revealed an ABBA issue between the memory notifiers and the DCD extent processing code.[0] Changing the lock ordering within DCD proved difficult because regions must be stable while searching for the proper region and then the device lock must be held to properly notify the DAX region driver of memory changes. Dan points out in the thread that notifiers should be able to trust that it is safe to access static data. Region data is static once the device is realized and until it's destruction. Thus it is better to manage the notifiers within the region driver. Remove the need for a lock by ensuring the notifiers are active only during the region's lifetime. Furthermore, remove cxl_region_nid() because resource can't be NULL while the region is stable. Link: https://lore.kernel.org/all/66b4cf539a79b_a36e829416@iweiny-mobl.notmuch/ [0] Cc: Ying Huang <ying.huang@intel.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ying Huang <ying.huang@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://patch.msgid.link/20240904-fix-notifiers-v3-1-576b4e950266@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-09cxl/pci: simplify the check of mem_enabled in cxl_hdm_decode_init()Yanfei Xu
Cases can be divided into two categories which are DVSEC range enabled and not enabled when HDM decoders exist but is not enabled. To avoid checking info->mem_enabled, which indicates the enablement of DVSEC range, every time, we can check !info->mem_enabled once in advance. This simplification can make the code clearer. No functional change intended. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yanfei Xu <yanfei.xu@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20240828084231.1378789-5-yanfei.xu@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-09cxl/pci: Check Mem_info_valid bit for each applicable DVSECYanfei Xu
In theory a device might set the mem_info_valid bit for a first range after it is ready but before as second range has reached that state. Therefore, the correct approach is to check the Mem_info_valid bit for each applicable DVSEC range against HDM_COUNT, rather than only for the DVSEC range 1. Consequently, let's move the check into the "for loop" that handles each DVSEC range. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yanfei Xu <yanfei.xu@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20240828084231.1378789-4-yanfei.xu@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-09cxl/pci: Remove duplicated implementation of waiting for memory_info_validYanfei Xu
commit ce17ad0d5498 ("cxl: Wait Memory_Info_Valid before access memory related info") added another implementation, which is cxl_dvsec_mem_range_valid(), of waiting for memory_info_valid without realizing it duplicated wait_for_valid(). Remove wait_for_valid() and retain cxl_dvsec_mem_range_valid() as the former is hardcoded to check only the Memory_Info_Valid bit of DVSEC range 1, while the latter allows for selection between DVSEC range 1 or 2 via parameter. Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yanfei Xu <yanfei.xu@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20240828084231.1378789-3-yanfei.xu@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-09cxl/pci: Fix to record only non-zero rangesYanfei Xu
The function cxl_dvsec_rr_decode() retrieves and records DVSEC ranges into info->dvsec_range[], regardless of whether it is non-zero range, and the variable info->ranges indicates the number of non-zero ranges. However, in cxl_hdm_decode_init(), the validation for info->dvsec_range[] occurs in a for loop that iterates based on info->ranges. It may result in zero range to be validated but non-zero range not be validated, in turn, the number of allowed ranges is to be 0. Address it by only record non-zero ranges. This fix is not urgent as it requires a configuration that zeroes out the first dvsec range while populating the second. This has not been observed, but it is theoretically possible. If this gets picked up for -stable, no harm done, but there is no urgency to backport. Fixes: 560f78559006 ("cxl/pci: Retrieve CXL DVSEC memory info") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Yanfei Xu <yanfei.xu@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20240828084231.1378789-2-yanfei.xu@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-09RDMA/bnxt_re: Fix the max WQE size for static WQE supportSelvin Xavier
When variable size WQE is supported, max_qp_sges reported is more than 6. For devices that supports variable size WQE, the Send WQE size calculation is wrong when an an older library that doesn't support variable size WQE is used. Set the WQE size to 128 when static WQE is supported. Fixes: de1d364c3815 ("RDMA/bnxt_re: Add support for Variable WQE in Genp7 adapters") Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1725444253-13221-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-09RDMA/bnxt_re: Fix the compatibility flag for variable size WQESelvin Xavier
For older adapters that doesn't support variable size WQE, driver is wrongly reporting that variable WQE is supported, when the latest library is used. Report the variable WQE capability only if the driver supports it. Fixes: 10a104c0debb ("RDMA/bnxt_re: Enable variable size WQEs for user space applications") Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1725444253-13221-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-09RDMA/mlx5: Fix MR cache temp entries cleanupMichael Guralnik
Fix the cleanup of the temp cache entries that are dynamically created in the MR cache. The cleanup of the temp cache entries is currently scheduled only when a new entry is created. Since in the cleanup of the entries only the mkeys are destroyed and the cache entry stays in the cache, subsequent registrations might reuse the entry and it will eventually be filled with new mkeys without cleanup ever getting scheduled again. On workloads that register and deregister MRs with a wide range of properties we see the cache ends up holding many cache entries, each holding the max number of mkeys that were ever used through it. Additionally, as the cleanup work is scheduled to run over the whole cache, any mkey that is returned to the cache after the cleanup was scheduled will be held for less than the intended 30 seconds timeout. Solve both issues by dropping the existing remove_ent_work and reusing the existing per-entry work to also handle the temp entries cleanup. Schedule the work to run with a 30 seconds delay every time we push an mkey to a clean temp entry. This ensures the cleanup runs on each entry only 30 seconds after the first mkey was pushed to an empty entry. As we have already been distinguishing between persistent and temp entries when scheduling the cache_work_func, it is not being scheduled in any other flows for the temp entries. Another benefit from moving to a per-entry cleanup is we now not required to hold the rb_tree mutex, thus enabling other flow to run concurrently. Fixes: dd1b913fb0d0 ("RDMA/mlx5: Cache all user cacheable mkeys on dereg MR flow") Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://patch.msgid.link/e4fa4bb03bebf20dceae320f26816cd2dde23a26.1725362530.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-09RDMA/mlx5: Limit usage of over-sized mkeys from the MR cacheMichael Guralnik
When searching the MR cache for suitable cache entries, don't use mkeys larger than twice the size required for the MR. This should ensure the usage of mkeys closer to the minimal required size and reduce memory waste. On driver init we create entries for mkeys with clear attributes and powers of 2 sizes from 4 to the max supported size. This solves the issue for anyone using mkeys that fit these requirements. In the use case where an MR is registered with different attributes, like an access flag we can't UMR, we'll create a new cache entry to store it upon dereg. Without this fix, any later registration with same attributes and smaller size will use the newly created cache entry and it's mkeys, disregarding the memory waste of using mkeys larger than required. For example, one worst-case scenario can be when registering and deregistering a 1GB mkey with ATS enabled which will cause the creation of a new cache entry to hold those type of mkeys. A user registering a 4k MR with ATS will end up using the new cache entry and an mkey that can support a 1GB MR, thus wasting x250k memory than actually needed in the HW. Additionally, allow all small registration to use the smallest size cache entry that is initialized on driver load even if size is larger than twice the required size. Fixes: 73d09b2fe833 ("RDMA/mlx5: Introduce mlx5r_cache_rb_key") Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://patch.msgid.link/8ba3a6e3748aace2026de8b83da03aba084f78f4.1725362530.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-09RDMA/mlx5: Fix counter update on MR cache mkey creationMichael Guralnik
After an mkey is created, update the counter for pending mkeys before reshceduling the work that is filling the cache. Rescheduling the work with a full MR cache entry and a wrong 'pending' counter will cause us to miss disabling the fill_to_high_water flag. Thus leaving the cache full but with an indication that it's still needs to be filled up to it's full size (2 * limit). Next time an mkey will be taken from the cache, we'll unnecessarily continue the process of filling the cache to it's full size. Fixes: 57e7071683ef ("RDMA/mlx5: Implement mkeys management via LIFO queue") Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://patch.msgid.link/0f44f462ba22e45f72cb3d0ec6a748634086b8d0.1725362530.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-09RDMA/mlx5: Drop redundant work canceling from clean_keys()Michael Guralnik
The canceling of dealyed work in clean_keys() is a leftover from years back and was added to prevent races in the cleanup process of MR cache. The cleanup process was rewritten a few years ago and the canceling of delayed work and flushing of workqueue was added before the call to clean_keys(). Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://patch.msgid.link/943d21f5a9dba7b98a3e1d531e3561ffe9745d71.1725362530.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-09RDMA/erdma: Return QP state in erdma_query_qpCheng Xu
Fix qp_state and cur_qp_state to return correct values in struct ib_qp_attr. Fixes: 155055771704 ("RDMA/erdma: Add verbs implementation") Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://patch.msgid.link/20240902112920.58749-4-chengyou@linux.alibaba.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-09RDMA/erdma: Add disassociate ucontext supportCheng Xu
All IO pages mapped to user space are handled by rdma_user_mmap_io, so add empty stub for disassociate ucontext. Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://patch.msgid.link/20240902112920.58749-3-chengyou@linux.alibaba.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-09-09RDMA/erdma: Refactor the initialization and destruction of EQCheng Xu
We extracted the common parts of the initialization/destruction process to make the code cleaner. Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Link: https://patch.msgid.link/20240902112920.58749-2-chengyou@linux.alibaba.com Signed-off-by: Leon Romanovsky <leon@kernel.org>