Age Commit message Author
2022-10-31rtnetlink: Honour NLM_F_ECHO flag in rtnl_delete_linkHangbin Liu
This patch uses the new helper unregister_netdevice_many_notify() in rtnl_delete_link(), so that the kernel can reply by unicast when userspace sets the NLM_F_ECHO flag to request the deleted interface info. At the same time, the parameters of rtnl_delete_link() need to be updated since we need the nlmsghdr and portid info. Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
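As a rough user-space illustration of the request side, the sketch below builds an RTM_DELLINK message with NLM_F_ECHO set. It assumes a Linux build environment with the UAPI netlink headers; the helper name build_dellink_echo_request() is hypothetical, and opening an AF_NETLINK socket and sending the message are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/*
 * Hypothetical helper (not part of any library): fill buf with an
 * RTM_DELLINK request that asks the kernel to echo the resulting
 * notification back to the sender. buf must be suitably aligned for
 * struct nlmsghdr. Returns the message length, or 0 if buf is too small.
 */
static size_t build_dellink_echo_request(void *buf, size_t buflen,
                                         int ifindex, unsigned int seq)
{
	size_t len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
	struct nlmsghdr *nlh = buf;
	struct ifinfomsg *ifi;

	assert(buf != NULL);
	if (buflen < len)
		return 0;

	memset(buf, 0, len);
	nlh->nlmsg_len = (unsigned int)len;
	nlh->nlmsg_type = RTM_DELLINK;
	/* NLM_F_ECHO is the flag these patches teach the kernel to honour. */
	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_ECHO;
	nlh->nlmsg_seq = seq;

	ifi = NLMSG_DATA(nlh);
	ifi->ifi_family = AF_UNSPEC;
	ifi->ifi_index = ifindex;
	return len;
}
```

A caller would send the returned bytes on a NETLINK_ROUTE socket and then read the echoed notification from the same socket.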
2022-10-31rtnetlink: Honour NLM_F_ECHO flag in rtnl_newlink_createHangbin Liu
This patch passes the netlink message header in rtnl_newlink_create() to the newly updated rtnl_configure_link(), so that the kernel can reply by unicast when userspace sets the NLM_F_ECHO flag to request the newly created interface info. Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-31net: add new helper unregister_netdevice_many_notifyHangbin Liu
Add a new helper, unregister_netdevice_many_notify(), which takes the netlink message header and portid; these can be used to notify userspace when the NLM_F_ECHO flag is set. Make unregister_netdevice_many() a wrapper around the new function unregister_netdevice_many_notify(). Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
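The wrapper arrangement described here can be sketched in user space as follows. All types and function names below are hypothetical stand-ins, not the kernel's; the point is only that the old entry point forwards to the new one with "no requester" defaults (portid 0, no message header).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's device list and message header. */
struct dev_list_stub { int ndevs; };
struct nlmsghdr_stub { uint16_t flags; };

/* Records the arguments of the most recent call, for illustration only. */
static struct {
	int ndevs;
	uint32_t portid;
	const struct nlmsghdr_stub *nlh;
} last_call;

/* New-style function: carries the requester's portid and message header. */
static void unregister_many_notify_sketch(struct dev_list_stub *head,
                                          uint32_t portid,
                                          const struct nlmsghdr_stub *nlh)
{
	assert(head != NULL);
	last_call.ndevs = head->ndevs;
	last_call.portid = portid;
	last_call.nlh = nlh;
}

/* Old-style entry point kept as a thin wrapper for existing callers. */
static void unregister_many_sketch(struct dev_list_stub *head)
{
	unregister_many_notify_sketch(head, 0, NULL);
}
```

Existing callers keep their signature unchanged, while callers that hold a netlink request can pass it through.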
2022-10-31rtnetlink: pass netlink message header and portid to rtnl_configure_link()Hangbin Liu
This patch passes the netlink message header and portid to rtnl_configure_link(). All the functions in this call chain need the new parameters so that they can be used in the last call, rtnl_notify(), to notify userspace about the new link info if the NLM_F_ECHO flag is set:
- rtnl_configure_link()
- __dev_notify_flags()
- rtmsg_ifinfo()
- rtmsg_ifinfo_event()
- rtmsg_ifinfo_build_skb()
- rtmsg_ifinfo_send()
- rtnl_notify()
Also move the __dev_notify_flags() declaration to net/core/dev.h, as Jakub suggested. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-31net: dpaa2: Add some debug prints on deferred probeSean Anderson
When this device is deferred, there is often no way to determine what the cause was. Add some debug prints to make it easier to figure out what is blocking the probe. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Link: https://lore.kernel.org/r/20221027190005.400839-1-sean.anderson@seco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-31net: mvneta: Remove unused variable iColin Ian King
Variable i is just being incremented and it's never used anywhere else. The variable and the increment are redundant, so remove them. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31Merge branch 'ptp-adjfine'David S. Miller
Jacob Keller says: ==================== ptp: convert drivers to .adjfine Many drivers implementing PTP have not yet migrated to the new .adjfine frequency adjustment implementation. A handful of these drivers use hardware with a simple increment value which is adjusted by multiplying by the adjustment factor and then dividing by 1 billion. This calculation is very easy to convert to .adjfine, by simply updating the divisor. Introduce new helper functions, diff_by_scaled_ppm and adjust_by_scaled_ppm which perform the most common calculations used by drivers for this purpose. The adjust_by_scaled_ppm takes the base increment and scaled PPM value, and calculates the new increment to use. A few drivers need the difference and direction rather than a raw increment value. The diff_by_scaled_ppm calculates the difference and returns true if it should be a subtraction, false otherwise. This most closely aligns with existing driver implementations. I previously submitted v1 of this series at [1], and got some feedback only on a handful of drivers. In the interest of merging the changes which have received feedback, I've dropped the following drivers out of this send: * ptp_phc * ptp_ipx46x * tg3 * hclge * stmac * cpts I plan to submit those drivers changes again at a later date. As before, there are some drivers which are not trivial to convert to the new helper functions. While they may be able to work, their implementation is different and I lack the hardware or datasheets to determine what the correct implementation would be. 
* drivers/net/ethernet/broadcom/bnx2x * drivers/net/ethernet/broadcom/bnxt * drivers/net/ethernet/cavium/liquidio * drivers/net/ethernet/chelsio/cxgb4 * drivers/net/ethernet/freescale * drivers/net/ethernet/qlogic/qed * drivers/net/ethernet/qlogic/qede * drivers/net/ethernet/sfc * drivers/net/ethernet/sfc/siena * drivers/net/ethernet/ti/am65-cpts.c * drivers/ptp/ptp_dte.c My end goal is to drop the .adjfreq implementation entirely, and to that end I plan on modifying these drivers in the future to directly use scaled_ppm_to_ppb as the simplest method to convert them. Changes since v2: * Rebased to allow landing in 6.2 * Added Richard's Acked-by Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Siva Reddy Kallam <siva.kallam@broadcom.com> Cc: Prashant Sreedharan <prashant@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Cc: Yisen Zhuang <yisen.zhuang@huawei.com> Cc: Salil Mehta <salil.mehta@huawei.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: Tariq Toukan <tariqt@nvidia.com> Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Bryan Whitehead <bryan.whitehead@microchip.com> Cc: Sergey Shtylyov <s.shtylyov@omp.ru> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Jose Abreu <joabreu@synopsys.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Vivek Thampi <vithampi@vmware.com> Cc: VMware PV-Drivers Reviewers <pv-drivers@vmware.com> Cc: Jie Wang <wangjie125@huawei.com> Cc: Jacob 
Keller <jacob.e.keller@intel.com> Cc: Guangbin Huang <huangguangbin2@huawei.com> Cc: Eran Ben Elisha <eranbe@nvidia.com> Cc: Aya Levin <ayal@nvidia.com> Cc: Cai Huoqing <cai.huoqing@linux.dev> Cc: Biju Das <biju.das.jz@bp.renesas.com> Cc: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Cc: Phil Edworthy <phil.edworthy@renesas.com> Cc: Jiasheng Jiang <jiasheng@iscas.ac.cn> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Wan Jiabing <wanjiabing@vivo.com> Cc: Lv Ruyi <lv.ruyi@zte.com.cn> Cc: Arnd Bergmann <arnd@arndb.de> ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31ptp: xgbe: convert to .adjfine and adjust_by_scaled_ppmJacob Keller
The xgbe implementation of .adjfreq is implemented in terms of a straightforward "base * ppb / 1 billion" calculation. Convert this driver to .adjfine and use adjust_by_scaled_ppm to calculate the new addend value. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31ptp: ravb: convert to .adjfine and adjust_by_scaled_ppmJacob Keller
The ravb implementation of .adjfreq is implemented in terms of a straightforward "base * ppb / 1 billion" calculation. Convert this driver to .adjfine and use the adjust_by_scaled_ppm helper function to calculate the new addend. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: Sergey Shtylyov <s.shtylyov@omp.ru> Cc: Biju Das <biju.das.jz@bp.renesas.com> Cc: Phil Edworthy <phil.edworthy@renesas.com> Cc: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Cc: linux-renesas-soc@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31ptp: lan743x: use diff_by_scaled_ppm in .adjfine implementationJacob Keller
Update the lan743x driver to use the recently added diff_by_scaled_ppm helper function. This reduces the amount of code required in lan743x_ptp.c driver file. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: Bryan Whitehead <bryan.whitehead@microchip.com> Cc: UNGLinuxDriver@microchip.com Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31ptp: lan743x: remove .adjfreq implementationJacob Keller
The lan743x driver implements both .adjfreq and .adjfine, but the core PTP subsystem prefers .adjfine if implemented. There is no reason to carry a .adjfreq implementation, so we can remove it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: Bryan Whitehead <bryan.whitehead@microchip.com> Cc: UNGLinuxDriver@microchip.com Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31ptp: mlx5: convert to .adjfine and adjust_by_scaled_ppmJacob Keller
The mlx5 implementation of .adjfreq is implemented in terms of a straightforward "base * ppb / 1 billion" calculation. Convert this to the .adjfine interface and use adjust_by_scaled_ppm for the calculation of the new mult value. Note that the mlx5_ptp_adjfreq_real_time function expects input in terms of ppb, so use scaled_ppm_to_ppb to convert before passing to this function. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Shirly Ohnona <shirlyo@nvidia.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: Gal Pressman <gal@nvidia.com> Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Aya Levin <ayal@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
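The scaled-ppm to ppb conversion mentioned here can be illustrated with a small user-space sketch. It assumes that scaled ppm is parts per million with a 16-bit binary fraction, so ppb = scaled_ppm * 1000 / 65536 = scaled_ppm * 125 / 8192. The function name is hypothetical and the rounding (truncation toward zero) is an assumption of this sketch; the kernel's own scaled_ppm_to_ppb may round differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch: convert scaled ppm (ppm with 16 fractional bits) to ppb.
 * 1000 / 65536 reduces to 125 / 8192. Truncates toward zero.
 */
static int64_t scaled_ppm_to_ppb_sketch(int64_t scaled_ppm)
{
	/* Guard against overflow in the multiplication below. */
	assert(scaled_ppm <= INT64_MAX / 125 && scaled_ppm >= INT64_MIN / 125);
	return scaled_ppm * 125 / 8192;
}
```

For example, a scaled_ppm of 65536 (exactly 1 ppm) converts to 1000 ppb.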
2022-10-31ptp: mlx4: convert to .adjfine and adjust_by_scaled_ppmJacob Keller
The mlx4 implementation of .adjfreq is implemented in terms of a straightforward "base * ppb / 1 billion" calculation. Convert this driver to .adjfine and use adjust_by_scaled_ppm to perform the calculation. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31drivers: convert unsupported .adjfreq to .adjfineJacob Keller
A few PTP drivers implement a .adjfreq handler which indicates the operation is not supported. Convert all of these to .adjfine. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Vivek Thampi <vithampi@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31ptp: introduce helpers to adjust by scaled parts per millionJacob Keller
Many drivers implement the .adjfreq or .adjfine PTP op function with the same basic logic: 1. Determine a base frequency value 2. Multiply this by the abs() of the requested adjustment, then divide by the appropriate divisor (1 billion, or 65.536 billion). 3. Add or subtract this difference from the base frequency to calculate a new adjustment. A few drivers need the difference and direction rather than the combined new increment value. I recently converted the Intel drivers to .adjfine and the scaled parts per million (65.536 parts per billion) logic. To avoid overflow with minimal loss of precision, mul_u64_u64_div_u64 was used. The basic logic used by all of these drivers is very similar, and leads to a lot of duplicate code to perform the same task. Rather than keep this duplicate code, introduce diff_by_scaled_ppm and adjust_by_scaled_ppm. These helper functions calculate the difference or adjustment necessary based on the scaled parts per million input. The diff_by_scaled_ppm function returns true if the difference should be subtracted, and false otherwise. Update the Intel drivers to use the new helper functions. Other vendor drivers will be converted to .adjfine and this helper function in the following changes. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31ptp: add missing documentation for parametersJacob Keller
The ptp_find_pin_unlocked function and the ptp_system_timestamp structure didn't document their parameters and fields. Fix this. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31net: phy: Add driver for Motorcomm yt8521 gigabit ethernet phyFrank
Add a driver for the Motorcomm yt8521 gigabit ethernet phy. We have verified the driver on the StarFive VisionFive development board, which is developed by Shanghai StarFive Technology Co., Ltd. On the board, the yt8521 gigabit ethernet phy works in UTP mode with an RGMII interface, and supports 1000M/100M/10M speeds and WoL (magic packet). Signed-off-by: Frank <Frank.Sae@motor-comm.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31net: microchip: sparx5: kunit test: change test_callbacks and test_vctrl to staticYang Yingliang
test_callbacks and test_vctrl are only used in vcap_api_kunit.c now, so change them to static. Fixes: 67d637516fa9 ("net: microchip: sparx5: Adding KUNIT test for the VCAP API") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31net: geneve: fix array of flexible structures warningsJakub Kicinski
New compilers don't like flexible array of flexible structs: include/net/geneve.h:62:34: warning: array of flexible structures Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31net: hns: hnae: remove unnecessary __module_get() and module_put()Yang Yingliang
hnae_ae_register() is called from hns_dsaf_probe(); the refcount of module hnae has already been taken in resolve_symbol() while calling the function, so the __module_get()/module_put() can be removed. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31netlink: split up copies in the ack constructionJakub Kicinski
Clean up the use of unsafe_memcpy() by adding a flexible array at the end of netlink message header and splitting up the header and data copies. Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
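The idea of ending a header with a flexible array and copying header and payload separately can be sketched in user space as below. The struct msg_hdr layout and the function name are hypothetical and only loosely modelled on a netlink message header; this is not the kernel's ack construction code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical message header ending in a flexible array for the payload. */
struct msg_hdr {
	uint32_t len;     /* total length: header plus payload */
	uint16_t type;
	uint16_t flags;
	uint8_t  data[];  /* payload follows the fixed-size header */
};

/*
 * Copy a message in two bounded steps instead of one copy that runs past
 * the end of the fixed-size header object. Returns a malloc()ed copy that
 * the caller must free(), or NULL on allocation failure.
 */
static struct msg_hdr *copy_msg_split(const struct msg_hdr *src)
{
	struct msg_hdr *dst;
	size_t payload;

	assert(src != NULL && src->len >= sizeof(*src));
	payload = src->len - sizeof(*src);

	dst = malloc(src->len);
	if (dst == NULL)
		return NULL;

	memcpy(dst, src, sizeof(*src));         /* copy 1: fixed header */
	memcpy(dst->data, src->data, payload);  /* copy 2: payload bytes */
	return dst;
}
```

Each copy now has a destination whose size is visible to bounds-checking tooling, which is the motivation the commit gives for dropping unsafe_memcpy().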
2022-10-31drivers: net: convert to boolean for the mac_managed_pm flagDenis Kirjanov
Signed-off-by: Dennis Kirjanov <dkirjanov@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31dt-bindings: net: snps,dwmac: Document queue config subnodesSebastian Reichel
The queue configuration is referenced by snps,mtl-rx-config and snps,mtl-tx-config. Some in-tree DTs and the example put the referenced config nodes directly beneath the root node, but most in-tree DTs put it as child node of the dwmac node. This adds proper description for this setup, which has the advantage of validating the queue configuration node content. The example is also updated to use the sub-node style, incl. the axi bus configuration node, which got the same treatment as the queues config in 5361660af6d3 ("dt-bindings: net: snps,dwmac: Document stmmac-axi-config subnode"). Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-30net: remove unused netdev_unregistering()Juhee Kang
Callers currently check dev->reg_state == NETREG_UNREGISTERING directly rather than using netdev_unregistering(). The netdev_unregistering() helper in netdevice.h is therefore no longer used, so remove it from netdevice.h. Signed-off-by: Juhee Kang <claudiajkang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28Merge tag 'mlx5-updates-2022-10-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxJakub Kicinski
Saeed Mahameed says: ==================== mlx5-updates-2022-10-24 SW steering updates from Yevgeny Kliteynik: 1) First four patches: small fixes / optimizations for SW steering: - Patch 1: Don't abort destroy flow if failed to destroy table - continue and free everything else. - Patches 2 and 3 deal with fast teardown: + Skip sync during fast teardown, as PCI device is not there any more. + Check device state when polling CQ - otherwise SW steering keeps polling the CQ forever, because nobody is there to flush it. - Patch 4: Removing unneeded function argument. 2) Deal with the hiccups that we get during rules insertion/deletion, which sometimes reach 1/4 of a second. While insertion/deletion rate improvement was not the focus here, it still is a by-product of removing these hiccups. Another by-product is the reduced standard deviation in measuring the duration of rules insertion/deletion bursts. In the testing we add K rules (warm-up phase), and then continuously do insertion/deletion bursts of N rules. During the test execution, the driver measures hiccups (amount and duration) and total time for insertion/deletion of a batch of rules.
Here are some numbers, before and after these patches:
+--------------------------------------------+-----------------+----------------+
|                                            |  Create rules   |  Delete rules  |
|                                            +--------+--------+--------+-------+
|                                            | Before | After  | Before | After |
+--------------------------------------------+--------+--------+--------+-------+
| Max hiccup [msec]                          | 253    | 42     | 254    | 68    |
+--------------------------------------------+--------+--------+--------+-------+
| Avg duration of 10K rules add/remove [msec]| 140.07 | 124.32 | 106.99 | 99.51 |
+--------------------------------------------+--------+--------+--------+-------+
| Num of hiccups per 100K rules add/remove   | 7.77   | 7.97   | 12.60  | 11.57 |
+--------------------------------------------+--------+--------+--------+-------+
| Avg hiccup duration [msec]                 | 36.92  | 33.25  | 36.15  | 33.74 |
+--------------------------------------------+--------+--------+--------+-------+
- Patch 5: Allocate a short array on stack instead of dynamically - it is destroyed at the end of the function.
- Patch 6: Rather than cleaning the corresponding chunk's section of ste_arrays on chunk deletion, initialize these areas upon chunk creation. Chunk destruction tends to come in large batches (during pool syncing), so instead of doing huge memory initialization during pool sync, we amortize this by doing small initializations on chunk creation.
- Patch 7: To simplify the error flow and allow cleaner addition of new pools, handle creation/destruction of all the domain's memory pools and other memory-related fields in separate init/uninit functions.
- Patch 8: During rehash, write each table row immediately instead of waiting for the whole table to be ready and writing it all - saves allocations of ste_send_info structures and improves performance.
- Patch 9: Instead of allocating/freeing send info objects dynamically, manage them in pool.
The number of send info objects doesn't depend on number of rules, so after pre-populating the pool with an initial batch of send info objects, the pool is not expected to grow. This way we save alloc/free during writing STEs to ICM, which by itself can sometimes take up to 40msec. - Patch 10: Allocate icm_chunks from their own slab allocator, which lowered the alloc/free "hiccups" frequency. - Patch 11: Similar to patch 9, allocate htbl from its own slab allocator. - Patch 12: Lower sync threshold for ICM hot memory - set the threshold for sync to 1/4 of the pool instead of 1/2 of the pool. Although we will have more syncs, each sync will be shorter and will help with insertion rate stability. Also, notice that the overall number of hiccups wasn't increased due to all the other patches. - Patch 13: Keep track of hot ICM chunks in an array instead of list. After steering sync, we traverse the hot list and finally free all the chunks. It appears that traversing a long list takes unusually long time due to cache misses on many entries, which causes a big "hiccup" during rule insertion. This patch replaces the list with pre-allocated array that stores only the bookkeeping information that is needed to later free the chunks in its buddy allocator. - Patch 14: Remove the unneeded buddy used_list - we don't need to have the list of used chunks, we only need the total amount of used memory. 
* tag 'mlx5-updates-2022-10-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: DR, Remove the buddy used_list net/mlx5: DR, Keep track of hot ICM chunks in an array instead of list net/mlx5: DR, Lower sync threshold for ICM hot memory net/mlx5: DR, Allocate htbl from its own slab allocator net/mlx5: DR, Allocate icm_chunks from their own slab allocator net/mlx5: DR, Manage STE send info objects in pool net/mlx5: DR, In rehash write the line in the entry immediately net/mlx5: DR, Handle domain memory resources init/uninit separately net/mlx5: DR, Initialize chunk's ste_arrays at chunk creation net/mlx5: DR, For short chains of STEs, avoid allocating ste_arr dynamically net/mlx5: DR, Remove unneeded argument from dr_icm_chunk_destroy net/mlx5: DR, Check device state when polling CQ net/mlx5: DR, Fix the SMFS sync_steering for fast teardown net/mlx5: DR, In destroy flow, free resources even if FW command failed ==================== Link: https://lore.kernel.org/r/20221027145643.6618-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28Merge branch 'net-ipa-start-adding-ipa-v5-0-functionality'Jakub Kicinski
Alex Elder says: ==================== net: ipa: start adding IPA v5.0 functionality The biggest change for IPA v5.0 is that it supports more than 32 endpoints. However there are two other unrelated changes: - The STATS_TETHERING memory region is not required - Filter tables no longer support a "global" filter Beyond this, refactoring some code makes supporting more than 32 endpoints (in an upcoming series) easier. So this series includes a few other changes (not in this order): - The maximum endpoint ID in use is determined during config - Loops over all endpoints only involve those in use - Endpoint IDs and their directions are checked for validity differently to simplify comparison against the maximum ==================== Link: https://lore.kernel.org/r/20221027122632.488694-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: ipa: record and use the number of defined endpoint IDsAlex Elder
Define a new field in the IPA structure that records the maximum number of entries that will be used in the IPA endpoint array. Use that value rather than IPA_ENDPOINT_MAX to determine the end condition for two loops that iterate over all endpoints. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: ipa: determine the maximum endpoint IDAlex Elder
Each endpoint ID has an entry in the IPA endpoint array. But the size of that array is defined at compile time. Instead, rename ipa_endpoint_data_valid() to be ipa_endpoint_max() and have it return the maximum endpoint ID defined in configuration data. That function will still validate configuration data. Zero is returned on error; it's a valid endpoint ID, but we need more than one, so it can't be the maximum. The next patch makes use of the returned maximum value. Finally, rename the "initialized" mask of endpoints defined by configuration data to be "defined". Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
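A minimal sketch of a "validate and return the maximum ID, zero on error" function is shown below. The struct endpoint_cfg type, the ENDPOINT_ARRAY_SIZE bound, and the function name are hypothetical; the real ipa_endpoint_max() validates considerably more than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENDPOINT_ARRAY_SIZE 36  /* hypothetical compile-time array bound */

/* Hypothetical configuration entry; only the ID matters for this sketch. */
struct endpoint_cfg {
	uint32_t endpoint_id;
};

/*
 * Return the highest endpoint ID found in cfg[0..count), or 0 on error.
 * Zero is itself a valid ID, but more than one endpoint is needed, so it
 * can never be the maximum and is free to signal failure.
 */
static uint32_t endpoint_max_sketch(const struct endpoint_cfg *cfg,
                                    size_t count)
{
	uint32_t max = 0;
	size_t i;

	assert(cfg != NULL || count == 0);
	for (i = 0; i < count; i++) {
		if (cfg[i].endpoint_id >= ENDPOINT_ARRAY_SIZE)
			return 0;  /* ID does not fit the endpoint array */
		if (cfg[i].endpoint_id > max)
			max = cfg[i].endpoint_id;
	}
	return max;
}
```

A caller could then bound its endpoint loops by the returned value rather than by the compile-time array size.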
2022-10-28net: ipa: refactor endpoint loopsAlex Elder
Change two functions that iterate over all endpoints to use while loops, using "endpoint_id" as the index variables in both spots. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: ipa: more completely check endpoint validityAlex Elder
Ensure all defined TX endpoints are in the range [0, CONS_PIPES) and defined RX endpoints are within [PROD_LOWEST, PROD_LOWEST+PROD_PIPES). Modify the way local variables are used to make the checks easier to understand. Check for each endpoint being in valid range in the loop, and drop the logical-AND check of initialized against unavailable IDs. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
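The two range checks can be written as a small predicate. This is a sketch only: the function name and parameter names are hypothetical, with cons_pipes, prod_lowest, and prod_pipes standing in for the CONS_PIPES, PROD_LOWEST, and PROD_PIPES values named in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * TX endpoints must lie in [0, cons_pipes);
 * RX endpoints must lie in [prod_lowest, prod_lowest + prod_pipes).
 */
static bool endpoint_id_in_range(bool is_tx, uint32_t id,
                                 uint32_t cons_pipes,
                                 uint32_t prod_lowest,
                                 uint32_t prod_pipes)
{
	/* The RX range must not wrap around the 32-bit ID space. */
	assert(prod_pipes <= UINT32_MAX - prod_lowest);

	if (is_tx)
		return id < cons_pipes;

	/* Subtract first so the upper-bound comparison cannot overflow. */
	return id >= prod_lowest && id - prod_lowest < prod_pipes;
}
```

Both ranges are half-open, so an ID equal to the upper bound is rejected.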
2022-10-28net: ipa: no more global filtering starting with IPA v5.0Alex Elder
IPA v5.0 eliminates the global filter table entry. As a result, there is no need to shift the filtered endpoint bitmap when it is written to IPA local memory. Update comments to explain this. Also delete a redundant block of comments above the function. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
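One way to picture the change is sketched below, under the assumption that the global filter entry occupied the lowest bit of the bitmap written to memory, which is why the per-endpoint bits had to be shifted up by one before v5.0. The function name and the boolean parameter are hypothetical, not the driver's interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch: produce the bitmap value written to local memory.
 * With a global filter entry (assumed to sit at bit 0), endpoint bits are
 * shifted left by one; without it, the endpoint bitmap is written as-is.
 */
static uint64_t filter_bitmap_for_memory(bool has_global_filter,
                                         uint64_t endpoint_map)
{
	if (!has_global_filter)
		return endpoint_map;

	/* Shifting must not push an endpoint bit off the top. */
	assert((endpoint_map >> 63) == 0);
	return endpoint_map << 1;
}
```

For an endpoint bitmap of 0x5, the value written would be 0xA with a global filter entry and 0x5 without one.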
2022-10-28net: ipa: change an IPA v5.0 memory requirementAlex Elder
Don't require IPA v5.0 to have a STATS_TETHERING memory region. Downstream defines its size to 0, so it apparently is unused. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: ipa: define IPA v5.0Alex Elder
In preparation for adding support for IPA v5.0, define it as an understood version. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net/packet: add PACKET_FANOUT_FLAG_IGNORE_OUTGOINGWillem de Bruijn
Extend packet socket option PACKET_IGNORE_OUTGOING to fanout groups. The socket option sets ptype.ignore_outgoing, which makes dev_queue_xmit_nit skip the socket. When the socket joins a fanout group, the option is not reflected in the struct ptype of the group. dev_queue_xmit_nit only tests the fanout ptype, so the flag is ignored once a socket joins a fanout group. Inheriting the option from a socket would change established behavior. Different sockets in the group can set different flags, and can also change them at runtime. Testing in packet_rcv_fanout defeats the purpose of the original patch, which is to avoid skb_clone in dev_queue_xmit_nit (esp. for MSG_ZEROCOPY packets). Instead, introduce a new fanout group flag with the same behavior. Tested with https://github.com/wdebruij/kerneltools/blob/master/tests/test_psock_fanout_ignore_outgoing.c Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20221027211014.3581513-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
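From user space, the new flag would be combined with the fanout type when joining a group. The sketch below assumes Linux UAPI headers; the fallback definition of PACKET_FANOUT_FLAG_IGNORE_OUTGOING (0x4000) is only for headers that predate this commit, and the helper names fanout_arg() and join_fanout_ignore_outgoing() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

#ifndef PACKET_FANOUT_FLAG_IGNORE_OUTGOING
#define PACKET_FANOUT_FLAG_IGNORE_OUTGOING 0x4000  /* for older headers */
#endif

/* Pack group ID (low 16 bits) and type plus flags (high 16 bits). */
static uint32_t fanout_arg(uint16_t group_id, uint16_t type, uint16_t flags)
{
	/* The fanout type occupies the low byte of the upper half. */
	assert((type & 0xff00) == 0);
	return (uint32_t)group_id | ((uint32_t)(type | flags) << 16);
}

/*
 * Join fanout group group_id on packet socket fd using hash distribution,
 * asking the group to skip outgoing packets.
 * Returns the setsockopt() result (0 on success, -1 on error).
 */
static int join_fanout_ignore_outgoing(int fd, uint16_t group_id)
{
	int arg = (int)fanout_arg(group_id, PACKET_FANOUT_HASH,
	                          PACKET_FANOUT_FLAG_IGNORE_OUTGOING);

	return setsockopt(fd, SOL_PACKET, PACKET_FANOUT, &arg, sizeof(arg));
}
```

Because the flag belongs to the group rather than to an individual socket, it is supplied at join time instead of being inherited from PACKET_IGNORE_OUTGOING.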
2022-10-28ice: Add additional CSR registers to ETHTOOL_GREGSLukasz Czapnik
In the event of a Tx hang it can be useful to read a variety of hardware registers to capture some state about why the transmit queue got stuck. Extend the ETHTOOL_GREGS dump provided by the ice driver with several CSR registers that provide such relevant information regarding the hardware Tx state. This enables capturing relevant data to enable debugging such a Tx hang. Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Link: https://lore.kernel.org/r/20221027104239.1691549-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28Merge branch 'clean-up-sfp-register-definitions'Jakub Kicinski
Russell King says: ==================== Clean up SFP register definitions This two-part patch series cleans up the SFP register definitions by 1. converting them from hex to decimal; as all the definitions in the documents use decimal, this makes it easier to cross-reference. 2. moving the bit definitions for each register alongside their register address definition ==================== Link: https://lore.kernel.org/r/Y1qFvaDlLVM1fHdG@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: sfp: move field definitions along side register indexRussell King (Oracle)
Just as we do for the A2h enum, arrange the A0h enum to have the field definitions next to their corresponding register index. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: sfp: convert register indexes from hex to decimalRussell King (Oracle)
The register indexes in the standards are in decimal rather than hex, so let's specify them in decimal in the header file so we can easily cross-reference without converting between hex and decimal. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28Merge branch 'net-mtk_eth_soc-improve-pcs-implementation'Jakub Kicinski
Russell King says: ==================== net: mtk_eth_soc: improve PCS implementation As a result of investigations from Frank Wunderlich, we know a lot more about the Mediatek "SGMII" PCS block, and can implement the PCS support correctly. This series achieves that, and Frank has tested the final result and reports that it works for him. The series could do with further testing by others, but I suspect that is unlikely to happen until it is merged based on past performances with this driver. Briefly, the patches in order: 1. Add a new helper to get the link timer duration in nanoseconds 2. Add definitions for the newly discovered registers and updates to bit definitions, including bitmasks for the BMCR, BMSR and two advertisement registers. 3. Remove unnecessary/unused error handling (functions always returning zero.) 4. Adding the missing pcs_get_state() implementation. 5. Converting the code to use regmap_update_bits() rather than open-coding read-modify-write sequences. 6. Adding out-of-band speed and duplex forcing for all non-inband modes not just the 802.3z link modes the code currently does. 7. Moving the release of the PHY power down to the main pcs_config() function. 8. Moving the interface speed selection to the main pcs_config() function. 9. Adding advertisement programming. 10. Adding correct link timer programming using the new helper in the first patch. 11. Adding support for 802.3z negotiation. There is one remaining issue - when configuring the PCS for in-band, for some reason the AN restart bit is always set. This should not be necessary, but requires further investigation with the hardware to find out whether it is really necessary. I suspect this was a work around for a previous poor implementation. ==================== Link: https://lore.kernel.org/r/Y1qDMw+DJLAJHT40@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: add support for in-band 802.3z negotiationRussell King (Oracle)
As a result of help from Frank Wunderlich to investigate and test, we now know how to program this PCS for in-band 802.3z negotiation. Add support for this by moving the contents of the two functions into the common mtk_pcs_config() function and adding the register settings for 802.3z negotiation. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: move and correct link timer programmingRussell King (Oracle)
Program the link timer appropriately for the interface mode being used, using the newly introduced phylink helper that provides the nanosecond link timer interval. The intervals are 1.6ms for SGMII based protocols and 10ms for 802.3z based protocols. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
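The mapping from interface mode to link timer interval described above can be sketched in plain C. This is only an illustration under stated assumptions, not the phylink helper itself: the enum, its members and the function name are hypothetical stand-ins for the kernel's types, and treating 2500BASE-X as an 802.3z based mode is an assumption made here. The two intervals (1.6ms and 10ms) are the ones given in the commit message.

```c
#include <errno.h>

/* Hypothetical stand-in for the kernel's interface mode type. */
enum sketch_interface {
	SKETCH_MODE_SGMII,
	SKETCH_MODE_1000BASEX,
	SKETCH_MODE_2500BASEX,
	SKETCH_MODE_RGMII,	/* not a serdes mode, so no link timer */
};

/* Return the link timer interval in nanoseconds, or -EINVAL. */
int sketch_link_timer_ns(enum sketch_interface mode)
{
	switch (mode) {
	case SKETCH_MODE_SGMII:
		return 1600000;		/* 1.6ms for SGMII based protocols */
	case SKETCH_MODE_1000BASEX:
	case SKETCH_MODE_2500BASEX:	/* assumed to count as 802.3z based */
		return 10000000;	/* 10ms for 802.3z based protocols */
	default:
		return -EINVAL;		/* inappropriate interface mode */
	}
}
```

A caller would check for a negative return before converting the interval into whatever units the hardware's link timer register expects.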
2022-10-28net: mtk_eth_soc: add advertisement programmingRussell King (Oracle)
Program the advertisement into the mtk PCS block. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: move interface speed selectionRussell King (Oracle)
Move the selection of the underlying interface speed to the pcs_config function, so we always program the interface speed. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: move PHY power upRussell King (Oracle)
The PHY power up is common to both configuration paths, so move it into the parent function. We need to do this for all serdes modes. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: add out of band forcing of speed and duplex in pcs_link_upRussell King (Oracle)
Add support for forcing the link speed and duplex setting in the pcs_link_up() method for out of band modes, which will be useful when we finish converting the pcs_config() method. Until then, we still have to force duplex for 802.3z modes to work correctly. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: convert mtk_sgmii to use regmap_update_bits()Russell King (Oracle)
mtk_sgmii does a lot of read-modify-write operations, for which there is a specific regmap function. Use this function instead of open-coding the operations. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
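To illustrate the kind of conversion this commit describes, here is a user-space sketch that models a register as a plain variable. Both function names are hypothetical. The second function only mimics the masking semantics of regmap_update_bits() (bits outside the mask are preserved, bits of the value outside the mask are ignored); it is not the regmap API and does no locking or bus access.

```c
#include <stdint.h>

/* Open-coded read-modify-write, the pattern being replaced. */
void sketch_rmw_open_coded(uint32_t *reg, uint32_t mask, uint32_t val)
{
	uint32_t tmp = *reg;	/* read the current value */

	tmp &= ~mask;		/* clear the field selected by mask */
	tmp |= val & mask;	/* insert the new field value */
	*reg = tmp;		/* write the result back */
}

/* The same operation expressed as a single update-bits style helper. */
int sketch_update_bits(uint32_t *reg, uint32_t mask, uint32_t val)
{
	*reg = (*reg & ~mask) | (val & mask);
	return 0;
}
```

Both helpers leave the register in the same state; the benefit of the single call in the driver is that the read, mask and write steps no longer have to be repeated at every call site.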
2022-10-28net: mtk_eth_soc: add pcs_get_state() implementationRussell King (Oracle)
Add a pcs_get_state() implementation which uses the advertisements to compute the resulting link modes, and BMSR contents to determine negotiation and link status. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
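The idea of deriving link state from BMSR and the two advertisement words can be sketched as follows. This is a simplified illustration for a 1000BASE-X style negotiation only, not the driver's code: the struct and function names are hypothetical, and pause and speed resolution are left out. The bit values are the standard MII ones (link status 0x0004, autonegotiation complete 0x0020, 1000BASE-X full duplex 0x0020, half duplex 0x0040).

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_BMSR_LSTATUS		0x0004	/* link status */
#define SK_BMSR_ANEGCOMPLETE	0x0020	/* autonegotiation complete */
#define SK_ADV_1000XFULL	0x0020	/* 1000BASE-X full duplex */
#define SK_ADV_1000XHALF	0x0040	/* 1000BASE-X half duplex */

/* Hypothetical, reduced link state: duplex is 1 (full), 0 (half), -1 (unknown). */
struct sketch_state {
	bool link;
	bool an_complete;
	int duplex;
};

struct sketch_state sketch_get_state(uint16_t bmsr, uint16_t adv, uint16_t lpa)
{
	struct sketch_state st = { false, false, -1 };
	uint16_t common = adv & lpa;	/* modes both ends advertise */

	st.link = (bmsr & SK_BMSR_LSTATUS) != 0;
	st.an_complete = (bmsr & SK_BMSR_ANEGCOMPLETE) != 0;
	if (!st.link || !st.an_complete)
		return st;		/* nothing to resolve yet */

	if (common & SK_ADV_1000XFULL)
		st.duplex = 1;
	else if (common & SK_ADV_1000XHALF)
		st.duplex = 0;
	else
		st.link = false;	/* no common mode: treat link as down */

	return st;
}
```

Full duplex is preferred over half duplex when both are common, which is why the full duplex bit is tested first.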
2022-10-28net: mtk_eth_soc: eliminate unnecessary error handlingRussell King (Oracle)
The functions called by the pcs_config() method always return zero, so there is no point trying to handle an error from these functions. Make these functions void, eliminate the "err" variable and simply return zero from the pcs_config() function itself. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: mtk_eth_soc: add definitions for PCSRussell King (Oracle)
As a result of help from Frank Wunderlich to investigate and test, we know a bit more about the PCS on the Mediatek platforms. Update the definitions from this investigation. This PCS appears similar, but not identical, to the Lynx PCS. Although not included in this patch, for future reference: the PHY ID registers at offset 4 read as 0x4d544950 ('MTIP'). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
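As a quick check of the ID value quoted above, the following sketch decodes a 32-bit value into four ASCII characters, most significant byte first. The function name is hypothetical, and the byte order is an assumption chosen because it reproduces the 'MTIP' reading given in the commit message.

```c
#include <stdint.h>

/* Decode a 32-bit ID into a NUL-terminated 4-character string. */
void sketch_decode_id(uint32_t id, char out[5])
{
	out[0] = (char)((id >> 24) & 0xff);	/* 0x4d -> 'M' */
	out[1] = (char)((id >> 16) & 0xff);	/* 0x54 -> 'T' */
	out[2] = (char)((id >> 8) & 0xff);	/* 0x49 -> 'I' */
	out[3] = (char)(id & 0xff);		/* 0x50 -> 'P' */
	out[4] = '\0';
}
```

Calling it with 0x4d544950 fills the buffer with "MTIP".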
2022-10-28net: phylink: add phylink_get_link_timer_ns() helperRussell King (Oracle)
Add a helper to convert the PHY interface mode to the required link timer setting as stated by the appropriate standard. Inappropriate interface modes return an error. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>