summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-10-25phy: phy_ethtool_ksettings_set: Move after phy_start_anegAndrew Lunn
This allows it to make use of a helper which assume the PHY is already locked. Fixes: 2d55173e71b0 ("phy: add generic function to support ksetting support") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25phy: phy_ethtool_ksettings_get: Lock the phy for consistencyAndrew Lunn
The PHY structure should be locked while copying information out if it, otherwise there is no guarantee of self consistency. Without the lock the PHY state machine could be updating the structure. Fixes: 2d55173e71b0 ("phy: add generic function to support ksetting support") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25ath10k: fix module load regression with iram-recovery featureAbinaya Kalaiselvan
Commit 9af7c32ceca8 ("ath10k: add target IRAM recovery feature support") introduced a new firmware feature flag ATH10K_FW_FEATURE_IRAM_RECOVERY. But this caused ath10k_pci module load to fail if ATH10K_FW_CRASH_DUMP_RAM_DATA bit was not enabled in the ath10k coredump_mask module parameter: [ 2209.328190] ath10k_pci 0000:02:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe [ 2209.434414] ath10k_pci 0000:02:00.0: kconfig debug 1 debugfs 1 tracing 1 dfs 1 testmode 1 [ 2209.547191] ath10k_pci 0000:02:00.0: firmware ver 10.4-3.9.0.2-00099 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast,no-ps,peer-fixed-rate,iram-recovery crc32 cbade90a [ 2210.896485] ath10k_pci 0000:02:00.0: board_file api 1 bmi_id 0:1 crc32 a040efc2 [ 2213.603339] ath10k_pci 0000:02:00.0: failed to copy target iram contents: -12 [ 2213.839027] ath10k_pci 0000:02:00.0: could not init core (-12) [ 2213.933910] ath10k_pci 0000:02:00.0: could not probe fw (-12) And by default coredump_mask does not have ATH10K_FW_CRASH_DUMP_RAM_DATA enabled so anyone using a firmware with iram-recovery feature would fail. To my knowledge only QCA9984 firmwares starting from release 10.4-3.9.0.2-00099 enabled the feature. The reason for regression was that ath10k_core_copy_target_iram() used ath10k_coredump_get_mem_layout() to get the memory layout, but when ATH10K_FW_CRASH_DUMP_RAM_DATA was disabled it would get just NULL and bail out with an error. While looking at all this I noticed another bug: if CONFIG_DEV_COREDUMP is disabled but the firmware has iram-recovery enabled the module load fails with similar error messages. I fixed that by returning 0 from ath10k_core_copy_target_iram() when _ath10k_coredump_get_mem_layout() returns NULL. Tested-on: QCA9984 hw2.0 PCI 10.4-3.9.0.2-00139 Fixes: 9af7c32ceca8 ("ath10k: add target IRAM recovery feature support") Signed-off-by: Abinaya Kalaiselvan <akalaise@codeaurora.org> Signed-off-by: Jouni Malinen <jouni@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211020075054.23061-1-kvalo@codeaurora.org
2021-10-25Merge branch 'qca8081-phy-driver'David S. Miller
Luo Jie says: ==================== net: phy: Add qca8081 ethernet phy driver This patch series add the qca8081 ethernet phy driver support, which improve the wol feature, leverage at803x phy driver and add the fast retrain, master/slave seed and CDT feature. Changes in v7: * update Reviewed-by tags. Changes in v6: * add Reviewed-by tags on the applicable patches. Changes in v5: * rebase the patches on net-next/master. Changes in v4: * handle other interrupts in set_wol. * add genphy_c45_fast_retrain. Changes in v3: * correct a typo "excpet". * remove the suffix "PHY" from phy name. Changes in v2: * add definitions of fast retrain related registers in mdio.h. * break up the patch into small patches. * improve the at803x legacy code. Changes in v1: * merge qca8081 phy driver into at803x. * add cdt feature. * leverage at803x phy driver helpers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: phy: add qca8081 cdt featureLuo Jie
To perform CDT of qca8081 phy: 1. disable hibernation. 2. force phy working in MDI mode. 3. force phy working in 1000BASE-T mode. 4. configure the related thresholds. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: phy: adjust qca8081 master/slave seed value if link downLuo Jie
1. The master/slave seed needs to be updated when the link can't be created. 2. The case where two qca8081 PHYs are connected each other and master/slave seed is generated as the same value also needs to be considered, so adding this code change into read_status instead of link_change_notify. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: phy: add qca8081 soft_reset and enable master/slave seedLuo Jie
qca8081 phy is a single port phy, configure phy the lower seed value to make it linked as slave mode easier. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: phy: add qca8081 config_initLuo Jie
Add the qca8081 phy driver config_init function, which includes: 1. Enable fast restrain. 2. Add 802.3az configurations. 3. Initialize ADC threshold as 100mv. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: phy: add genphy_c45_fast_retrainLuo Jie
Add generic fast retrain auto-negotiation function for C45 PHYs. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: phy: add constants for fast retrain related registerLuo Jie
Add the constants for 2.5G fast retrain capability in 10G AN control register, fast retrain status and control register and THP bypass register into mdio.h. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: phy: add qca8081 config_anegLuo Jie
Reuse at803x phy driver config_aneg excepting adding 2500M auto-negotiation. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: phy: add qca8081 get_featuresLuo Jie
Reuse the at803x phy driver get_features excepting adding 2500M capability. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: phy: add qca8081 read_statusLuo Jie
1. Separate the function at803x_read_specific_status from the at803x_read_status, since it can be reused by the read_status of qca8081 phy driver excepting adding the 2500M speed. 2. Add the qca8081 read_status function qca808x_read_status. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: phy: add qca8081 ethernet phy driverLuo Jie
qca8081 is a single port ethernet phy chip that supports 10/100/1000/2500 Mbps mode. Add the basic phy driver features, and reuse the at803x phy driver functions. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: phy: at803x: use GENMASK() for speed statusLuo Jie
Use GENMASK() for the current speed value. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: phy: at803x: improve the WOL featureLuo Jie
The wol feature is controlled by the MMD3.8012 bit5, need to set this bit when the wol function is enabled. The reg18 bit0 is for enabling WOL interrupt, when wol occurs, the wol interrupt status reg19 bit0 is set to 1. Call phy_trigger_machine if there are any other interrupt pending in the function set_wol. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: phy: at803x: use phy_modify()Luo Jie
Convert at803x_set_wol to use phy_modify. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: phy: at803x: replace AT803X_DEVICE_ADDR with MDIO_MMD_PCSLuo Jie
Replace AT803X_DEVICE_ADDR with MDIO_MMD_PCS defined in mdio.h. Signed-off-by: Luo Jie <luoj@codeaurora.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25ath10k: fix invalid dma_addr_t token assignmentArnd Bergmann
Using a kernel pointer in place of a dma_addr_t token can lead to undefined behavior if that makes it into cache management functions. The compiler caught one such attempt in a cast: drivers/net/wireless/ath/ath10k/mac.c: In function 'ath10k_add_interface': drivers/net/wireless/ath/ath10k/mac.c:5586:47: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast] 5586 | arvif->beacon_paddr = (dma_addr_t)arvif->beacon_buf; | ^ Looking through how this gets used down the way, I'm fairly sure that beacon_paddr is never accessed again for ATH10K_DEV_TYPE_HL devices, and if it was accessed, that would be a bug. Change the assignment to use a known-invalid address token instead, which avoids the warning and makes it easier to catch bugs if it does end up getting used. Fixes: e263bdab9c0e ("ath10k: high latency fixes for beacon buffer") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211014075153.3655910-1-arnd@kernel.org
2021-10-25ath11k: change return buffer manager for QCA6390Baochen Qiang
QCA6390 firmware uses HAL_RX_BUF_RBM_SW1_BM, not HAL_RX_BUF_RBM_SW3_BM. This is needed to fix a case where an A-MSDU has an unexpected LLC/SNAP header in the first subframe (CVE-2020-24588). Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1 Signed-off-by: Baochen Qiang <bqiang@codeaurora.org> Signed-off-by: Jouni Malinen <jouni@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20210914163726.38604-2-jouni@codeaurora.org
2021-10-25Merge branch 'hns3-next'David S. Miller
Guangbin Huang says: ==================== net: hns3: updates for -next This series includes some updates for the HNS3 ethernet driver. for it. off. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: hns3: add error recovery module and type for himacJiaran Zhang
This patch adds himac error recovery module, link_error type and ptp_error type for himac. Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: hns3: add new ras error type for roceWeihang Li
This patch adds one ras error of bus related for roce, this error including RRESP/BRESP and read poison error. Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: hns3: add update ethtool advertised link modes for FIBRE port when ↵Guangbin Huang
autoneg off Currently, the ethtool advertised link modes of FIBRE port is cleared to zero when autoneg is off, so user can not get the advertised link modes info directly from "ethtool <dev>" command. In order to ameliorate this situation, update data of speeds, fec and pause of advertised link modes when autoneg is off. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: hns3: modify functions of converting speed ability to ethtool link modeGuangbin Huang
The functions of converting speed ability to ethtool link mode just support setting mac->supported currently, to reuse these functions to set ethtool link mode for others(i.e. advertising), delete the argument mac and add argument link_mode. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: hns3: add support pause/pfc durations for mac statisticsGuangbin Huang
The mac statistics add pause/pfc durations in device version V3, we can get total active cycle of pause/pfc from these durations. As driver gets register number from firmware to calculate desc number to query mac statistics, it needs to set mac statistics extended enable bit in firmware command 0x701A to tell firmware that driver supports extended mac statistics, otherwise firmware only returns register number of version V1. As pause/pfc durations are not supported by hardware of old version, they should not been shown in command "ethtool -S ethX" in this case, so add checking max register number of each mac statistic in their version. If the max register number of one mac statistic is greater than register number got from firmware, it means hardware does not support this mac statistic, so ignore this statistic when get string and data of mac statistic. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: hns3: device specifications add number of mac statisticsGuangbin Huang
Currently, driver queries number of mac statistics before querying mac statistics. As the number of mac statistics is a fixed value in firmware, it is redundant to query this number everytime before querying mac statistics, it can just be queried once in initialization process and saved in device specifications. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: hns3: modify mac statistics update process for compatibilityGuangbin Huang
After querying mac statistics from firmware, driver copies data from descriptors to struct mac_stats of hdev, and the number of copied data is just according to the register number queried from firmware. There is a problem that if the register number queried from firmware is larger than data number of struct mac_stats, it will cause a copy overflow. So if the firmware adds more mac statistics in later version, it is not compatible with driver of old version. To fix this problem, the number of copied data needs to be used the minimum value between the register number queried from firmware and data number of struct mac_stats. The first descriptor has three data and there is one reserved, to optimize the copy process, add this reserverd data to struct mac_stats. Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: hns3: add debugfs support for interrupt coalesceHuazhong Tan
Since user may need to check the current configuration of the interrupt coalesce, so add debugfs support for query this info. Create a single file "coalesce_info" for it, and query it by "cat coalesce_info", return the result to userspace. For device whose version is above V3(include V3), the GL's register contains usecs and 1us unit configuration. When get the usecs configuration from this register, it will include the confusing unit configuration, so add a GL mask to get the correct value, and add a QL mask for the frames configuration as well. The display style is below: $ cat coalesce_info tx interrupt coalesce info: VEC_ID ALGO_STATE PROFILE_ID CQE_MODE TUNE_STATE STEPS_LEFT... 0 IN_PROG 4 EQE ON_TOP 0... 1 START 3 EQE LEFT 1... rx interrupt coalesce info: VEC_ID ALGO_STATE PROFILE_ID CQE_MODE TUNE_STATE STEPS_LEFT... 0 IN_PROG 3 EQE LEFT 1... 1 IN_PROG 0 EQE ON_TOP 0... Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25Merge branch 's390-qeth-next'David S. Miller
Julian Wiedmann says: ==================== s390/qeth: updates 2021-10-25 please apply the following patch series for qeth to netdev's net-next tree. This brings some minor maintenance improvements, and a bunch of cleanups so that the W=1 build passes without warning. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25s390/qeth: update kerneldoc for qeth_add_hw_header()Julian Wiedmann
qeth_add_hw_header() is missing documentation for some of its parameters, fix that up. Reported-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25s390/qeth: fix kernel doc commentsHeiko Carstens
Fix kernel doc comments and remove incorrect kernel doc indicators. Acked-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25s390/qeth: add __printf format attribute to qeth_dbf_longtextHeiko Carstens
Allow the compiler to recognize and check format strings and parameters. As reported with allmodconfig and W=1: drivers/s390/net/qeth_core_main.c: In function ‘qeth_dbf_longtext’: drivers/s390/net/qeth_core_main.c:6190:9: error: function ‘qeth_dbf_longtext’ might be a candidate for ‘gnu_printf’ format attribute [-Werror=suggest-attribute=format] 6190 | vsnprintf(dbf_txt_buf, sizeof(dbf_txt_buf), fmt, args); | ^~~~~~~~~ Acked-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25s390/qeth: fix various format stringsHeiko Carstens
Various format strings don't match with types of parameters. Fix all of them. Acked-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25s390/qeth: don't keep track of Input Queue countJulian Wiedmann
The only actual user of qdio.no_input_queues is qeth_qdio_establish(), and there we already have full awareness of the current Input Queue configuration (1 RX queue, plus potentially 1 TX Completion queue). So avoid this state tracking, and the ambiguity it brings with it. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25s390/qeth: clarify remaining dev_kfree_skb_any() usersJulian Wiedmann
For none of the users we are under risk of running in HW IRQ context or or with IRQs disabled. Thus we always end up in consume_skb(). But the two occurences in the RX path should really report the dropped packet to dropmon, so have them use kfree_skb() instead. That's also consistent with what napi_free_frags() does internally. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25s390/qeth: move qdio's QAOB cache into qethJulian Wiedmann
qdio.ko no longer needs to care about how the QAOBs are allocated, from its perspective they are merely another parameter to do_QDIO(). So for a start, shift the cache into the only qdio driver that uses QAOBs (ie. qeth). Here there's further opportunity to optimize its usage in the future - eg. make it per-{device, TX queue}, or only compile it when the driver is built with CQ/QAOB support. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25s390/qeth: remove .do_ioctl() callback from driver disciplineJulian Wiedmann
With commit 18787eeebd71 ("qeth: use ndo_siocdevprivate") this callback is now actually used to handle transport mode-specific _private_ ioctls. We only have such ioctls for L3 devices. So wire up a L3-specific .ndo_siocdevprivate() callback that handles those ioctls, and defers to the core qeth_siocdevprivate() for all other private ioctls. This takes the discipline one step closer to its original purpose of providing an internal extension for the qeth_core_ccwgroup_driver. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25s390/qeth: improve trace entries for MAC address (un)registrationJulian Wiedmann
Add the failed MAC address into the trace message. Also fix up one format string to use %x instead of %u for the CARD_DEVID. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25ARM: 9148/1: handle CONFIG_CPU_ENDIAN_BE32 in arch/arm/kernel/head.SLABBE Corentin
My intel-ixp42x-welltech-epbx100 no longer boot since 4.14. This is due to commit 463dbba4d189 ("ARM: 9104/2: Fix Keystone 2 kernel mapping regression") which forgot to handle CONFIG_CPU_ENDIAN_BE32 as possible BE config. Suggested-by: Krzysztof Hałasa <khalasa@piap.pl> Fixes: 463dbba4d189 ("ARM: 9104/2: Fix Keystone 2 kernel mapping regression") Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2021-10-25Merge branch 'dsa-rtnl'David S. Miller
Vladimir Oltean says: ==================== Drop rtnl_lock from DSA .port_fdb_{add,del} As mentioned in the RFC posted 2 months ago: https://patchwork.kernel.org/project/netdevbpf/cover/20210824114049.3814660-1-vladimir.oltean@nxp.com/ DSA is transitioning to a driver API where the rtnl_lock is not held when calling ds->ops->port_fdb_add() and ds->ops->port_fdb_del(). Drivers cannot take that lock privately from those callbacks either. This change is required so that DSA can wait for switchdev FDB work items to finish before leaving the bridge. That change will be made in a future patch series. A small selftest is provided with the patch set in the hope that concurrency issues uncovered by this series, but not spotted by me by code inspection, will be caught. A status of the existing drivers: - mv88e6xxx_port_fdb_add() and mv88e6xxx_port_fdb_del() take mv88e6xxx_reg_lock() so they should be safe. - qca8k_fdb_add() and qca8k_fdb_del() take mutex_lock(&priv->reg_mutex) so they should be safe. - hellcreek_fdb_add() and hellcreek_fdb_add() take mutex_lock(&hellcreek->reg_lock) so they should be safe. - ksz9477_port_fdb_add() and ksz9477_port_fdb_del() take mutex_lock(&dev->alu_mutex) so they should be safe. - b53_fdb_add() and b53_fdb_del() did not have locking, so I've added a scheme based on my own judgement there (not tested). - felix_fdb_add() and felix_fdb_del() did not have locking, I've added and tested a locking scheme there. - mt7530_port_fdb_add() and mt7530_port_fdb_del() take mutex_lock(&priv->reg_mutex), so they should be safe. - gswip_port_fdb() did not have locking, so I've added a non-expert locking scheme based on my own judgement (not tested). - lan9303_alr_add_port() and lan9303_alr_del_port() take mutex_lock(&chip->alr_mutex) so they should be safe. - sja1105_fdb_add() and sja1105_fdb_del() did not have locking, I've added and tested a locking scheme. Changes in v3: Unlock arl_mutex only once in b53_fdb_dump(). Changes in v4: - Use __must_hold in ocelot and b53 - Add missing mutex_init in lantiq_gswip - Clean up the selftest a bit. Changes in v5: - Replace __must_hold with a comment. - Add a new patch (01/10). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25selftests: net: dsa: add a stress test for unlocked FDB operationsVladimir Oltean
This test is a bit strange in that it is perhaps more manual than others: it does not transmit a clear OK/FAIL verdict, because user space does not have synchronous feedback from the kernel. If a hardware access fails, it is in deferred context. Nonetheless, on sja1105 I have used it successfully to find and solve a concurrency issue, so it can be used as a starting point for other driver maintainers too. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25selftests: lib: forwarding: allow tests to not require mz and jqVladimir Oltean
These programs are useful, but not all selftests require them. Additionally, on embedded boards without package management (things like buildroot), installing mausezahn or jq is not always as trivial as downloading a package from the web. So it is actually a bit annoying to require programs that are not used. Introduce options that can be set by scripts to not enforce these dependencies. For compatibility, default to "yes". Cc: Nikolay Aleksandrov <nikolay@nvidia.com> Cc: Ido Schimmel <idosch@nvidia.com> Cc: Guillaume Nault <gnault@redhat.com> Cc: Po-Hsu Lin <po-hsu.lin@canonical.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: dsa: drop rtnl_lock from dsa_slave_switchdev_event_workVladimir Oltean
After talking with Ido Schimmel, it became clear that rtnl_lock is not actually required for anything that is done inside the SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE deferred work handlers. The reason why it was probably added by Arkadi Sharshevsky in commit c9eb3e0f8701 ("net: dsa: Add support for learning FDB through notification") was to offer the same locking/serialization guarantees as .ndo_fdb_{add,del} and avoid reworking any drivers. DSA has implemented .ndo_fdb_add and .ndo_fdb_del until commit b117e1e8a86d ("net: dsa: delete dsa_legacy_fdb_add and dsa_legacy_fdb_del") - that is to say, until fairly recently. But those methods have been deleted, so now we are free to drop the rtnl_lock as well. Note that exposing DSA switch drivers to an unlocked method which was previously serialized by the rtnl_mutex is a potentially dangerous affair. Driver writers couldn't ensure that their internal locking scheme does the right thing even if they wanted. We could err on the side of paranoia and introduce a switch-wide lock inside the DSA framework, but that seems way overreaching. Instead, we could check as many drivers for regressions as we can, fix those first, then let this change go in once it is assumed to be fairly safe. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: dsa: introduce locking for the address lists on CPU and DSA portsVladimir Oltean
Now that the rtnl_mutex is going away for dsa_port_{host_,}fdb_{add,del}, no one is serializing access to the address lists that DSA keeps for the purpose of reference counting on shared ports (CPU and cascade ports). It can happen for one dsa_switch_do_fdb_del to do list_del on a dp->fdbs element while another dsa_switch_do_fdb_{add,del} is traversing dp->fdbs. We need to avoid that. Currently dp->mdbs is not at risk, because dsa_switch_do_mdb_{add,del} still runs under the rtnl_mutex. But it would be nice if it would not depend on that being the case. So let's introduce a mutex per port (the address lists are per port too) and share it between dp->mdbs and dp->fdbs. The place where we put the locking is interesting. It could be tempting to put a DSA-level lock which still serializes calls to .port_fdb_{add,del}, but it would still not avoid concurrency with other driver code paths that are currently under rtnl_mutex (.port_fdb_dump, .port_fast_age). So it would add a very false sense of security (and adding a global switch-wide lock in DSA to resynchronize with the rtnl_lock is also counterproductive and hard). So the locking is intentionally done only where the dp->fdbs and dp->mdbs lists are traversed. That means, from a driver perspective, that .port_fdb_add will be called with the dp->addr_lists_lock mutex held on the CPU port, but not held on user ports. This is done so that driver writers are not encouraged to rely on any guarantee offered by dp->addr_lists_lock. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: dsa: lantiq_gswip: serialize access to the PCE registersVladimir Oltean
The GSWIP switch accesses various bridging layer tables (VLANs, FDBs, forwarding rules) indirectly through PCE registers. These hardware accesses are non-atomic, being comprised of several register reads and writes. These accesses are currently serialized by the rtnl_lock, but DSA is changing its driver API and that lock will no longer be held when calling ->port_fdb_add() and ->port_fdb_del(). So this driver needs to serialize the access to the PCE registers using its own locking scheme. This patch adds that. Note that the driver also uses the gswip_pce_load_microcode() function to load a static configuration for the packet classification engine into a table using the same registers. It is currently not protected, but since that configuration is only done from the dsa_switch_ops :: setup method, there is no risk of it being concurrent with other operations. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: dsa: b53: serialize access to the ARL tableVladimir Oltean
The b53 driver performs non-atomic transactions to the ARL table when adding, deleting and reading FDB and MDB entries. Traditionally these were all serialized by the rtnl_lock(), but now it is possible that DSA calls ->port_fdb_add and ->port_fdb_del without holding that lock. So the driver must have its own serialization logic. Add a mutex and hold it from all entry points (->port_fdb_{add,del,dump}, ->port_mdb_{add,del}). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: mscc: ocelot: serialize access to the MAC tableVladimir Oltean
DSA would like to remove the rtnl_lock from its SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE handlers, and the felix driver uses the same MAC table functions as ocelot. This means that the MAC table functions will no longer be implicitly serialized with respect to each other by the rtnl_mutex, we need to add a dedicated lock in ocelot for the non-atomic operations of selecting a MAC table row, reading/writing what we want and polling for completion. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: dsa: sja1105: serialize access to the dynamic config interfaceVladimir Oltean
The sja1105 hardware seems as concurrent as can be, but when we create a background script that adds/removes a rain of FDB entries without the rtnl_mutex taken, then in parallel we do another operation like run 'bridge fdb show', we can notice these errors popping up: sja1105 spi2.0: port 2 failed to read back entry for 00:01:02:03:00:40 vid 0: -ENOENT sja1105 spi2.0: port 2 failed to add 00:01:02:03:00:40 vid 0 to fdb: -2 sja1105 spi2.0: port 2 failed to read back entry for 00:01:02:03:00:46 vid 0: -ENOENT sja1105 spi2.0: port 2 failed to add 00:01:02:03:00:46 vid 0 to fdb: -2 Luckily what is going on does not require a major rework in the driver. The sja1105_dynamic_config_read() function sends multiple SPI buffers to the peripheral until the operation completes. We should not do anything until the hardware clears the VALID bit. But since there is no locking (i.e. right now we are implicitly serialized by the rtnl_mutex, but if we remove that), it might be possible that the process which performs the dynamic config read is preempted and another one performs a dynamic config write. What will happen in that case is that sja1105_dynamic_config_read(), when it resumes, expects to see VALIDENT set for the entry it reads back. But it won't. This can be corrected by introducing a mutex for serializing SPI accesses to the dynamic config interface which should be atomic with respect to each other. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-25net: dsa: sja1105: wait for dynamic config command completion on writes tooVladimir Oltean
The hardware manual says that software should attempt a new dynamic config access (be it a a write or a read-back) only while the VALID bit is cleared. The VALID bit is set by software to 1, and it remains set as long as the hardware is still processing the request. Currently the driver only polls for the command completion only for reads, because that's when we need the actual data read back. Writes have been more or less "asynchronous", although this has never been an observable issue. This change makes sja1105_dynamic_config_write poll the VALID bit as well, to absolutely ensure that a follow-up access to the static config finds the VALID bit cleared. So VALID means "work in progress", while VALIDENT means "entry being read is valid". On reads we check the VALIDENT bit too, while on writes that bit is not always defined. So we need to factor it out of the loop, and make the loop provide back the unpacked command structure, so that sja1105_dynamic_config_read can check the VALIDENT bit. The change also attempts to convert the open-coded loop to use the read_poll_timeout macro, since I know this will come up during review. It's more code, but hey, it uses read_poll_timeout! Tested on SJA1105T, SJA1105S, SJA1110A. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>