summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-04-20convert sgx_set_attribute() to fdget()/fdput()Al Viro
Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-04-20convert setns(2) to fdget()/fdput()Al Viro
Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2023-04-20net: skbuff: update and rename __kfree_skb_defer()Jakub Kicinski
__kfree_skb_defer() uses the old naming where "defer" meant slab bulk free/alloc APIs. In the meantime we also made __kfree_skb_defer() feed the per-NAPI skb cache, which implies bulk APIs. So take away the 'defer' and add 'napi'. While at it add a drop reason. This only matters on the tx_action path, if the skb has a frag_list. But getting rid of a SKB_DROP_REASON_NOT_SPECIFIED seems like a net benefit so why not. Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20230420020005.815854-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-20eth: mlx5: avoid iterator use outside of a loopJakub Kicinski
Fix the following warning about risky iterator use: drivers/net/ethernet/mellanox/mlx5/core/eq.c:1010 mlx5_comp_irq_get_affinity_mask() warn: iterator used outside loop: 'eq' Acked-by: Saeed Mahameed <saeed@kernel.org> Link: https://lore.kernel.org/r/20230420015802.815362-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-20flow_dissector: Address kdoc warningsSimon Horman
Address a number of warnings flagged by ./scripts/kernel-doc -none include/net/flow_dissector.h include/net/flow_dissector.h:23: warning: Function parameter or member 'addr_type' not described in 'flow_dissector_key_control' include/net/flow_dissector.h:23: warning: Function parameter or member 'flags' not described in 'flow_dissector_key_control' include/net/flow_dissector.h:46: warning: Function parameter or member 'padding' not described in 'flow_dissector_key_basic' include/net/flow_dissector.h:145: warning: Function parameter or member 'tipckey' not described in 'flow_dissector_key_addrs' include/net/flow_dissector.h:157: warning: cannot understand function prototype: 'struct flow_dissector_key_arp ' include/net/flow_dissector.h:171: warning: cannot understand function prototype: 'struct flow_dissector_key_ports ' include/net/flow_dissector.h:203: warning: cannot understand function prototype: 'struct flow_dissector_key_icmp ' Also improve indentation on adjacent lines to those changed to address the above. No functional changes intended. Signed-off-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230419-flow-dissector-kdoc-v1-1-1aa0cca1118b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-20gve: update MAINTAINERSJeroen de Borst
This reflects role changes in our team. Signed-off-by: Jeroen de Borst <jeroendb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Link: https://lore.kernel.org/r/20230419210558.1893400-1-jeroendb@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-20Merge tag 'drm-fixes-2023-04-21' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "This is the regular and hopefully last round of fixes for 6.3. Pretty small, a few amdgpu, one i915, one nouveau, one rockchip and one gpu scheduler fix: nouveau: - fix dma-resv timeout rockchip: - fix suspend/resume sched: - fix timeout handling i915: - Fix fast wake AUX sync len amdgpu: - GPU reset fix - DCN 3.1.5 line buffer fix - Display fix for single channel memory configs - Fix a possible divide by 0" * tag 'drm-fixes-2023-04-21' of git://anongit.freedesktop.org/drm/drm: drm/amd/display: fix a divided-by-zero error drm/amd/display: limit timing for single dimm memory drm/amd/display: set dcn315 lb bpp to 48 drm/amdgpu: Fix desktop freezed after gpu-reset drm/rockchip: vop2: Use regcache_sync() to fix suspend/resume drm/nouveau: fix incorrect conversion to dma_resv_wait_timeout() drm/rockchip: vop2: fix suspend/resume drm/i915: Fix fast wake AUX sync len drm/sched: Check scheduler ready before calling timeout handling
2023-04-20page_pool: unlink from napi during destroyJakub Kicinski
Jesper points out that we must prevent recycling into cache after page_pool_destroy() is called, because page_pool_destroy() is not synchronized with recycling (some pages may still be outstanding when destroy() gets called). I assumed this will not happen because NAPI can't be scheduled if its page pool is being destroyed. But I missed the fact that NAPI may get reused. For instance when user changes ring configuration driver may allocate a new page pool, stop NAPI, swap, start NAPI, and then destroy the old pool. The NAPI is running so old page pool will think it can recycle to the cache, but the consumer at that point is the destroy() path, not NAPI. To avoid extra synchronization let the drivers do "unlinking" during the "swap" stage while NAPI is indeed disabled. Fixes: 8c48eea3adf3 ("page_pool: allow caching from safely localized NAPI") Reported-by: Jesper Dangaard Brouer <jbrouer@redhat.com> Link: https://lore.kernel.org/all/e8df2654-6a5b-3c92-489d-2fe5e444135f@redhat.com/ Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/r/20230419182006.719923-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-20net: phy: fix circular LEDS_CLASS dependenciesArnd Bergmann
The CONFIG_PHYLIB symbol is selected by a number of device drivers that need PHY support, but it now has a dependency on CONFIG_LEDS_CLASS, which may not be enabled, causing build failures. Avoid the risk of missing and circular dependencies by guarding the phylib LED support itself in another Kconfig symbol that can only be enabled if the dependency is met. This could be made a hidden symbol and always enabled when both CONFIG_OF and CONFIG_LEDS_CLASS are reachable from the phylib, but there may be an advantage in having users see this option when they have a misconfigured kernel without built-in LED support. Fixes: 01e5b728e9e4 ("net: phy: Add a binding for PHY LEDs") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20230420084624.3005701-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-20x86: rewrite '__copy_user_nocache' functionLinus Torvalds
I didn't really want to do this, but as part of all the other changes to the user copy loops, I've been looking at this horror. I tried to clean it up multiple times, but every time I just found more problems, and the way it's written, it's just too hard to fix them. For example, the code is written to do quad-word alignment, and will use regular byte accesses to get to that point. That's fairly simple, but it means that any initial 8-byte alignment will be done with cached copies. However, the code then is very careful to do any 4-byte _tail_ accesses using an uncached 4-byte write, and that was claimed to be relevant in commit a82eee742452 ("x86/uaccess/64: Handle the caching of 4-byte nocache copies properly in __copy_user_nocache()"). So if you do a 4-byte copy using that function, it carefully uses a 4-byte 'movnti' for the destination. But if you were to do a 12-byte copy that is 4-byte aligned, it would _not_ do a 4-byte 'movnti' followed by a 8-byte 'movnti' to keep it all uncached. Instead, it would align the destination to 8 bytes using a byte-at-a-time loop, and then do a 8-byte 'movnti' for the final 8 bytes. The main caller that cares is __copy_user_flushcache(), which knows about this insanity, and has odd cases for it all. But I just can't deal with looking at this kind of "it does one case right, and another related case entirely wrong". And the code really wasn't fixable without hard drugs, which I try to avoid. So instead, rewrite it in a form that hopefully not only gets this right, but is a bit more maintainable. Knock wood. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-20Revert "net/mlx5e: Don't use termination table when redundant"Vlad Buslov
This reverts commit 14624d7247fcd0f3114a6f5f17b3c8d1020fbbb7. The termination table usage is requires for DMFS steering mode as firmware doesn't support mixed table destinations list which causes following syndrome with hairpin rules: [81922.283225] mlx5_core 0000:08:00.0: mlx5_cmd_out_err:803:(pid 25977): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0xaca205), err(-22) Fixes: 14624d7247fc ("net/mlx5e: Don't use termination table when redundant") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5e: Nullify table pointer when failing to createAya Levin
On failing to create promisc flow steering table, the pointer is returned with an error. Nullify it so unloading the driver won't try to destroy a non existing table. Failing to create promisc table may happen over BF devices when the ARM side is going through a firmware tear down. The host side start a reload flow. While the driver unloads, it tries to remove the promisc table. Remove WARN in this state as it is a valid error flow. Fixes: 1c46d7409f30 ("net/mlx5e: Optimize promiscuous mode") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5: Use recovery timeout on sync reset flowMoshe Shemesh
Use the same timeout for sync reset flow and health recovery flow, since the former involves driver's recovery from firmware reset, which is similar to health recovery. Otherwise, in some cases, such as a firmware upgrade on the DPU, the firmware pre-init bit may not be ready within current timeout and the driver will abort loading back after reset. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Fixes: 37ca95e62ee2 ("net/mlx5: Increase FW pre-init timeout for health recovery") Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20Revert "net/mlx5: Remove "recovery" arg from mlx5_load_one() function"Moshe Shemesh
This reverts commit 5977ac3910f1cbaf44dca48179118b25c206ac29. Revert this patch as we need the "recovery" arg back in mlx5_load_one() function. This arg will be used in the next patch for using recovery timeout during sync reset flow. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5e: Fix error flow in representor failing to add vport rx ruleRoi Dayan
On representor init rx error flow the flow steering pointer is being released so mlx5e_attach_netdev() doesn't have a valid fs pointer in its error flow. Make sure the pointer is nullified when released and add a check in mlx5e_fs_cleanup() to verify fs is not null as representor cleanup callback would be called anyway. Fixes: af8bbf730068 ("net/mlx5e: Convert mlx5e_flow_steering member of mlx5e_priv to pointer") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5: Release tunnel device after tc update skbChris Mi
The cited commit causes a regression. Tunnel device is not released after tc update skb if skb needs to be freed. The following error message will be printed: unregister_netdevice: waiting for vxlan1 to become free. Usage count = 11 Fix it by releasing tunnel device if skb needs to be freed. Fixes: 93a1ab2c545b ("net/mlx5: Refactor tc miss handling to a single function") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5: E-switch, Don't destroy indirect table in split ruleChris Mi
Source port rewrite (forward to ovs internal port or statck device) isn't supported in the rule of split action. So there is no indirect table in split rule. The cited commit destroyes indirect table in split rule. The indirect table for other rules will be destroyed wrongly. It will cause traffic loss. Fix it by removing the destroy function in split rule. And also remove the destroy function in error flow. Fixes: 10742efc20a4 ("net/mlx5e: VF tunnel TX traffic offloading") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5: E-switch, Create per vport table based on devlink encap modeChris Mi
Currently when creating per vport table, create flags are hardcoded. Devlink encap mode is set based on user input and HW capability. Create per vport table based on devlink encap mode. Fixes: c796bb7cd230 ("net/mlx5: E-switch, Generalize per vport table API") Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5e: Release the label when replacing existing ct entryVlad Buslov
Cited commit doesn't release the label mapping when replacing existing ct entry which leads to following memleak report: unreferenced object 0xffff8881854cf280 (size 96): comm "kworker/u48:74", pid 23093, jiffies 4296664564 (age 175.944s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000002722d368>] __kmalloc+0x4b/0x1c0 [<00000000cc44e18f>] mapping_add+0x6e8/0xc90 [mlx5_core] [<000000003ad942a7>] mlx5_get_label_mapping+0x66/0xe0 [mlx5_core] [<00000000266308ac>] mlx5_tc_ct_entry_create_mod_hdr+0x1c4/0xf50 [mlx5_core] [<000000009a768b4f>] mlx5_tc_ct_entry_add_rule+0x16f/0xaf0 [mlx5_core] [<00000000a178f3e5>] mlx5_tc_ct_block_flow_offload_add+0x10cb/0x1f90 [mlx5_core] [<000000007b46c496>] mlx5_tc_ct_block_flow_offload+0x14a/0x630 [mlx5_core] [<00000000a9a18ac5>] nf_flow_offload_tuple+0x1a3/0x390 [nf_flow_table] [<00000000d0881951>] flow_offload_work_handler+0x257/0xd30 [nf_flow_table] [<000000009e4935a4>] process_one_work+0x7c2/0x13e0 [<00000000f5cd36a7>] worker_thread+0x59d/0xec0 [<00000000baed1daf>] kthread+0x28f/0x330 [<0000000063d282a4>] ret_from_fork+0x1f/0x30 Fix the issue by correctly releasing the label mapping. Fixes: 94ceffb48eac ("net/mlx5e: Implement CT entry update") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5e: Don't clone flow post action attributes second timeVlad Buslov
The code already clones post action attributes in mlx5e_clone_flow_attr_for_post_act(). Creating another copy in mlx5e_tc_post_act_add() is a erroneous leftover from original implementation. Instead, assign handle->attribute to post_attr provided by the caller. Note that cloning the attribute second time is not just wasteful but also causes issues like second copy not being properly updated in neigh update code which leads to following use-after-free: Feb 21 09:02:00 c-237-177-40-045 kernel: BUG: KASAN: use-after-free in mlx5_cmd_set_fte+0x200d/0x24c0 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: kasan_report+0xbb/0x1a0 Feb 21 09:02:00 c-237-177-40-045 kernel: kasan_save_stack+0x1e/0x40 Feb 21 09:02:00 c-237-177-40-045 kernel: kasan_set_track+0x21/0x30 Feb 21 09:02:00 c-237-177-40-045 kernel: __kasan_kmalloc+0x7a/0x90 Feb 21 09:02:00 c-237-177-40-045 kernel: kasan_save_stack+0x1e/0x40 Feb 21 09:02:00 c-237-177-40-045 kernel: kasan_set_track+0x21/0x30 Feb 21 09:02:00 c-237-177-40-045 kernel: kasan_save_free_info+0x2a/0x40 Feb 21 09:02:00 c-237-177-40-045 kernel: ____kasan_slab_free+0x11a/0x1b0 Feb 21 09:02:00 c-237-177-40-045 kernel: page dumped because: kasan: bad access detected Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5_core 0000:08:00.0: mlx5_cmd_out_err:803:(pid 8833): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0xf2ff71), err(-22) Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5_core 0000:08:00.0 enp8s0f0: Failed to add post action rule Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5_core 0000:08:00.0: mlx5e_tc_encap_flows_add:190:(pid 8833): Failed to update flow post acts, -22 Feb 21 09:02:00 c-237-177-40-045 kernel: Call Trace: Feb 21 09:02:00 c-237-177-40-045 kernel: <TASK> Feb 21 09:02:00 c-237-177-40-045 kernel: dump_stack_lvl+0x57/0x7d Feb 21 09:02:00 c-237-177-40-045 kernel: print_report+0x170/0x471 Feb 21 09:02:00 c-237-177-40-045 kernel: ? mlx5_cmd_set_fte+0x200d/0x24c0 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: kasan_report+0xbb/0x1a0 Feb 21 09:02:00 c-237-177-40-045 kernel: ? mlx5_cmd_set_fte+0x200d/0x24c0 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5_cmd_set_fte+0x200d/0x24c0 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: ? __module_address.part.0+0x62/0x200 Feb 21 09:02:00 c-237-177-40-045 kernel: ? mlx5_cmd_stub_create_flow_table+0xd0/0xd0 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: ? __raw_spin_lock_init+0x3b/0x110 Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5_cmd_create_fte+0x80/0xb0 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: add_rule_fg+0xe80/0x19c0 [mlx5_core] -- Feb 21 09:02:00 c-237-177-40-045 kernel: Allocated by task 13476: Feb 21 09:02:00 c-237-177-40-045 kernel: kasan_save_stack+0x1e/0x40 Feb 21 09:02:00 c-237-177-40-045 kernel: kasan_set_track+0x21/0x30 Feb 21 09:02:00 c-237-177-40-045 kernel: __kasan_kmalloc+0x7a/0x90 Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5_packet_reformat_alloc+0x7b/0x230 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5e_tc_tun_create_header_ipv4+0x977/0xf10 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5e_attach_encap+0x15b4/0x1e10 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: post_process_attr+0x305/0xa30 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5e_tc_add_fdb_flow+0x4c0/0xcf0 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: __mlx5e_add_fdb_flow+0x7cf/0xe90 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5e_configure_flower+0xcaa/0x4b90 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5e_rep_setup_tc_cls_flower+0x99/0x1b0 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5e_rep_setup_tc_cb+0x133/0x1e0 [mlx5_core] -- Feb 21 09:02:00 c-237-177-40-045 kernel: Freed by task 8833: Feb 21 09:02:00 c-237-177-40-045 kernel: kasan_save_stack+0x1e/0x40 Feb 21 09:02:00 c-237-177-40-045 kernel: kasan_set_track+0x21/0x30 Feb 21 09:02:00 c-237-177-40-045 kernel: kasan_save_free_info+0x2a/0x40 Feb 21 09:02:00 c-237-177-40-045 kernel: ____kasan_slab_free+0x11a/0x1b0 Feb 21 09:02:00 c-237-177-40-045 kernel: __kmem_cache_free+0x1de/0x400 Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5_packet_reformat_dealloc+0xad/0x100 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5e_tc_encap_flows_del+0x3c0/0x500 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5e_rep_update_flows+0x40c/0xa80 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: mlx5e_rep_neigh_update+0x473/0x7a0 [mlx5_core] Feb 21 09:02:00 c-237-177-40-045 kernel: process_one_work+0x7c2/0x1310 Feb 21 09:02:00 c-237-177-40-045 kernel: worker_thread+0x59d/0xec0 Feb 21 09:02:00 c-237-177-40-045 kernel: kthread+0x28f/0x330 Fixes: 8300f225268b ("net/mlx5e: Create new flow attr for multi table actions") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5: Update op_mode to op_mod for port selectionRoi Dayan
To be consistent with the other enum keys use OP_MOD instead of OP_MODE. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5: E-Switch, Remove unused mlx5_esw_offloads_vport_metadata_set()Roi Dayan
Remove unused function which also seems a duplicate of esw_port_metadata_set(). Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5: E-Switch, Remove redundant dev arg from mlx5_esw_vport_alloc()Roi Dayan
The passded esw->dev is redundant as esw being passed and esw->dev being used inside. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5: Include linux/pci.h for pci_msix_can_alloc_dyn()Eli Cohen
Add include directive to assure pci_msix_can_alloc_dyn() prototype. Fixes: 3354822cde5a ("net/mlx5: Use dynamic msix vectors allocation") Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202303291328.sNmTyyWF-lkp@intel.com/ Signed-off-by: Eli Cohen <elic@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5e: RX, Hook NAPIs to page poolsDragos Tatulea
Linking the NAPI to the rq page_pool to improve page_pool cache usage during skb recycling. Here are the observed improvements for a iperf single stream test case: - For 1500 MTU and legacy rq, seeing a 20% improvement of cache usage. - For 9K MTU, seeing 33-40 % page_pool cache usage improvements for both striding and legacy rq (depending if the application is running on the same core as the rq or not). Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5e: RX, Fix XDP_TX page release for legacy rq nonlinear caseDragos Tatulea
When the XDP handler marks the data for transmission (XDP_TX), it is incorrect to release the page fragment. Instead, the fragments should be marked as MLX5E_WQE_FRAG_SKIP_RELEASE because XDP will release the page directly to the page_pool (page_pool_put_defragged_page) after TX completion. The linear case already does this. This patch fixes the nonlinear part as well. Also, the looping over the fragments was incorrect: When handling pages after XDP_TX in the legacy rq nonlinear case, the loop was skipping the first wqe fragment. Fixes: 3f93f82988bc ("net/mlx5e: RX, Defer page release in legacy rq for better recycling") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5e: RX, Fix releasing page_pool pages twice for striding RQDragos Tatulea
mlx5e_free_rx_descs is responsible for calling the dealloc_wqe op which returns pages to the page_pool. This can happen during flush or close. For XSK, the regular RQ is flushed (when replaced by the XSK RQ) and also closed later. This is normally not a problem as the wqe list is empty on a second call to mlx5e_free_rx_descs. However, for striding RQ, the previously released wqes from the list will appear as missing and will be released a second time by mlx5e_free_rx_missing_descs. This patch sets the no release bits on the striding RQ wqes in the dealloc_wqe op to prevent releasing the pages a second time. Please note that the bits are set only in the control path during close and not in the data path. Fixes: 4c2a13236807 ("net/mlx5e: RX, Defer page release in striding rq for better recycling") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5e: Add vnic devlink health reporter to representorsMaher Sanalla
Create a new devlink health reporter for representor interface, which reports the values of representor vnic diagnostic counters when diagnosed. This patch will allow admins to monitor VF diagnostic counters through the representor-interface vnic reporter. Example of usage: $ devlink health diagnose pci/0000:08:00.0/65537 reporter vnic vNIC env counters: total_error_queues: 0 send_queue_priority_update_flow: 0 comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0 invalid_command: 0 quota_exceeded_command: 0 nic_receive_steering_discard: 0 Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5: Add vnic devlink health reporter to PFs/VFsMaher Sanalla
Create a vnic devlink health reporter for PFs/VFs interfaces. The reporter's diagnose callback displays the values of vNIC/vport transport debug counters of PFs/VFs, as follows: $ devlink health diagnose pci/0000:08:00.0 reporter vnic vNIC env counters: total_error_queues: 0 send_queue_priority_update_flow: 0 comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0 invalid_command: 0 quota_exceeded_command: 0 nic_receive_steering_discard: 0 Moreover, add documentation on the reporter functionality and the counters description. While at it, expose the vNIC counters diagnose function to be used by the downstream patch, which will reveal the counters for representor interfaces. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20Revert "net/mlx5: Expose vnic diagnostic counters for eswitch managed vports"Maher Sanalla
This reverts commit 606e6a72e29dff9e3341c4cc9b554420e4793f401 which exposes the vnic diagnostic counters via debugfs. Instead, The upcoming series will expose the same counters through devlink health reporter. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20Revert "net/mlx5: Expose steering dropped packets counter"Maher Sanalla
This reverts commit 4fe1b3a5f8fe2fdcedcaba9561e5b0ae5cb1d15b, which exposes the steering dropped packets counter via debugfs. The upcoming series will expose the counter via devlink health reporter instead of debugfs. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5: DR, Add memory statistics for domain objectYevgeny Kliteynik
Add counters for number of buddies that are currently in use per domain per buddy type (STE, MODIFY-HEADER, MODIFY-PATTERN). Signed-off-by: Erez Shitrit <erezsh@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5: DR, Add more info in domain dbg dumpYevgeny Kliteynik
Add additinal items to domain info dump: Linux version and device name. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5: DR, Calculate sync threshold of each pool according to its typeYevgeny Kliteynik
When certain ICM chunk is no longer needed, it needs to be freed. Fully freeing ICM memory involves issuing FW SYNC_STEERING command. This is very time consuming, and it is impractical to do it for every freed chunk. Instead, we manage these 'freed' chunks in hot list (list of chunks that are not required by SW any more, but HW might still access them). When size of the hot list reaches certain threshold, we purge it and issue SYNC_STEERING FW command. There is one threshold for all the different ICM types, which is not optimal, as different ICM types require different approach: STEs pool is very large, and it is very 'dynamic' in its nature, so letting hot list to become too large will result in a significant perf hiccup when purging the hot list. Modify action is much smaller and less dynamic, so we can let the hot list to grow to almost the size of the whole pool. This patch fixes this problem: instead of having same hot memory threshold for all the pools, sync operation will be triggered in accordance with the ICM type. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-20net/mlx5: DR, Fix dumping of legacy modify_hdr in debug dumpYevgeny Kliteynik
The steering dump parser expects to see 0 as rewrite num of actions in case pattern/args aren't supported - parsing of legacy modify header is based on this assumption. Fix this to align to parser's expectation. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-04-21Merge tag 'amd-drm-fixes-6.3-2023-04-19' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.3-2023-04-19: amdgpu: - GPU reset fix - DCN 3.1.5 line buffer fix - Display fix for single channel memory configs - Fix a possible divide by 0 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230420031717.7790-1-alexander.deucher@amd.com
2023-04-21Merge tag 'drm-intel-fixes-2023-04-19' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for v6.3 final: - Fix fast wake AUX sync len Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87354w1b76.fsf@intel.com
2023-04-21Merge tag 'drm-misc-fixes-2023-04-20-2' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Short summary of fixes pull: * nouveau: fix dma-resv timeout * rockchip: fix suspend/resume * sched: fix timeout handling Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230420083114.GA17651@linux-uq9g
2023-04-20docs: clk: add documentation to log which clocks have been disabledBrian Masney
The existing clk documentation has a section that talks about the clk_ignore_unused kernel parameter. Add additional documentation that describes how to log which clocks the kernel disables on bootup. This will log messages like the following to the console on bootup: [ 1.268115] clk: Disabling unused clocks [ 1.272167] clk_disable: gcc_usb_clkref_en [ 1.276389] clk_disable: gcc_usb30_sec_sleep_clk [ 1.281131] clk_disable: gcc_usb30_prim_sleep_clk ... Signed-off-by: Brian Masney <bmasney@redhat.com> Link: https://lore.kernel.org/r/20230411192153.289688-1-bmasney@redhat.com [jc: turned parameters into a literal block] Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-04-20docs: trace: Fix typo in ftrace.rstLin Yu Chen
There is a typo in the sentence "A kernel developer must be conscience ...". The word conscience should be conscious. This patch fixes it. Signed-off-by: Lin Yu Chen <starpt.official@gmail.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Link: https://lore.kernel.org/r/20230412183739.89894-1-starpt.official@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-04-20Documentation/process: always CC responsible listsKrzysztof Kozlowski
The "Select the recipients for your patch" part about CC-ing mailing lists is a bit vague and might be understood that only some lists should be Cc-ed. That's not what most of the maintainers expect. For given code, associated mailing list must always be CC-ed, because the list is used for reviewing and testing patches. Example are the Devicetree bindings patches, which are tested iff Devicetree mailing list is CC-ed. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20230413165501.47442-1-krzysztof.kozlowski@linaro.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-04-20docs: kmemleak: adjust to config renamingLukas Bulwahn
Commit c87db8ca0902 ("kmemleak-test: fix kmemleak_test.c build logic") essentially renames the config DEBUG_KMEMLEAK_TEST to SAMPLE_KMEMLEAK, but misses to adjust the documentation. Adjust kmemleak documentation to this config renaming. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230414061241.12754-1-lukas.bulwahn@gmail.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-04-20ELF: document some de-facto PT_* ABI quirksAlexey Dobriyan
Turns out rules about PT_INTERP, PT_GNU_STACK and PT_GNU_PROPERTY program headers are slightly different. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Link: https://lore.kernel.org/r/88d3f1bb-f4e0-4c40-9304-3843513a1262@p183 Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-04-20Documentation: arm: remove stih415/stih416 related entriesAlain Volmat
ST's STiH415 and STiH416 platforms support have been removed since a long time already. This commit updates the sti related documentation overview to remove related entries and update the sti part to add STiH407/STiH410 and STiH418 platforms which are still actively supported. Signed-off-by: Alain Volmat <avolmat@me.com> Link: https://lore.kernel.org/r/20230416185349.18156-1-avolmat@me.com Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-04-20docs: turn off "smart quotes" in the HTML buildJonathan Corbet
We have long disabled the "html_use_smartypants" option to prevent Sphinx from mangling "--" sequences (among others). Unfortunately, Sphinx changed that option to "smartquotes" in the 1.6.6 release, and seemingly didn't see fit to warn about the use of the obsolete option, resulting in the aforementioned mangling returning. Disable this behavior again and hope that the option name stays stable for a while. Reported-by: Zipeng Zhang <zhangzipeng0@foxmail.com> Link: https://lore.kernel.org/lkml/tencent_CB1A298D31FD221496FF657CD7EF406E6605@qq.com Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-04-20Merge branch 'fix __retval() being always ignored'Alexei Starovoitov
Eduard Zingerman says: ==================== Florian Westphal found a bug in test_loader.c processing of __retval tag. Because of this bug the function test_loader.c:do_prog_test_run() never executed and all __retval test tags were ignored. See [1]. Fix for this bug uncovers two additional bugs: - During test_verifier tests migration to inline assembly (see [2]) I missed the fact that some tests require maps to contain mock values; - Some issue with a new refcounted_kptr test, which causes kernel to produce dead lock and refcount saturation warnings when subject to libbpf's bpf_test_run_opts(). This series fixes the bug in __retval() processing, and address the issue with test maps not being populated. The issue in refcounted_kptr is not addressed, __retval tags in those tests are commented out. I found that the following tests depend on test maps being populated: - progs/verifier_array_access.c - verifier/value_ptr_arith.c (planned for migration to inline assembly) Given the small amount of these tests I decided to opt for simple non-generic solution (see patch #4). [1] https://lore.kernel.org/bpf/f4c4aee644425842ee6aa8edf1da68f0a8260e7c.camel@gmail.com/T/ [2] https://lore.kernel.org/bpf/20230325025524.144043-1-eddyz87@gmail.com/ ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-20selftests/bpf: populate map_array_ro map for verifier_array_access testEduard Zingerman
Two test cases: - "valid read map access into a read-only array 1" and - "valid read map access into a read-only array 2" Expect that map_array_ro map is filled with mock data. This logic was not taken into acount during initial test conversion. This commit modifies prog_tests/verifier.c entry point for this test to fill the map. Fixes: a3c830ae0209 ("selftests/bpf: verifier/array_access.c converted to inline assembly") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230420232317.2181776-5-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-20selftests/bpf: add pre bpf_prog_test_run_opts() callback for test_loaderEduard Zingerman
When a test case is annotated with __retval tag the test_loader engine would use libbpf's bpf_prog_test_run_opts() to do a test run of the program and compare retvals. This commit allows to perform arbitrary actions on bpf object right before test loader invokes bpf_prog_test_run_opts(). This could be used to setup some state for program execution, e.g. fill some maps. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230420232317.2181776-4-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-20selftests/bpf: fix __retval() being always ignoredEduard Zingerman
Florian Westphal found a bug in and suggested a fix for test_loader.c processing of __retval tag. Because of this bug the function test_loader.c:do_prog_test_run() never executed and all __retval test tags were ignored. If this bug is fixed a number of test cases from progs/verifier_array_access.c fail with retval not matching the expected value. This test was recently converted to use test_loader.c and inline assembly in [1]. When doing the conversion I missed the important detail of test_verifier.c operation: when it creates fixup_map_array_ro, fixup_map_array_wo and fixup_map_array_small it populates these maps with a dummy record. Disabling the __retval checks for the affected verifier_array_access in this commit to avoid false-postivies in any potential bisects. The issue is addressed in the next patch. I verified that the __retval tags are now respected by changing expected return values for all tests annotated with __retval, and checking that these tests started to fail. [1] https://lore.kernel.org/bpf/20230325025524.144043-1-eddyz87@gmail.com/ Fixes: 19a8e06f5f91 ("selftests/bpf: Tests execution support for test_loader.c") Reported-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/bpf/f4c4aee644425842ee6aa8edf1da68f0a8260e7c.camel@gmail.com/T/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230420232317.2181776-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-04-20selftests/bpf: disable program test run for progs/refcounted_kptr.cEduard Zingerman
Florian Westphal found a bug in test_loader.c processing of __retval tag. Because of this bug the function test_loader.c:do_prog_test_run() never executed and all __retval test tags were ignored. This hid an issue with progs/refcounted_kptr.c tests. When __retval tag bug is fixed and refcounted_kptr.c tests are run kernel reports various issues and eventually hangs. Shortest reproducer is the following command run a few times: $ for i in $(seq 1 4); do (./test_progs --allow=refcounted_kptr &); done Commenting out __retval tags for these tests until this issue is resolved. Reported-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/bpf/f4c4aee644425842ee6aa8edf1da68f0a8260e7c.camel@gmail.com/T/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20230420232317.2181776-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>