summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-09-09idpf: convert to libeth Tx buffer completionAlexander Lobakin
&idpf_tx_buffer is almost identical to the previous generations, as well as the way it's handled. Moreover, relying on dma_unmap_addr() and !!buf->skb instead of explicit defining of buffer's type was never good. Use the newly added libeth helpers to do it properly and reduce the copy-paste around the Tx code. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-09-09libeth: add Tx buffer completion helpersAlexander Lobakin
Software-side Tx buffers for storing DMA, frame size, skb pointers etc. are pretty much generic and every driver defines them the same way. The same can be said for software Tx completions -- same napi_consume_skb()s and all that... Add a couple simple wrappers for doing that to stop repeating the old tale at least within the Intel code. Drivers are free to use 'priv' member at the end of the structure. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-09-09tracing: Drop unused helper function to fix the buildAndy Shevchenko
A helper function defined but not used. This, in particular, prevents kernel builds with clang, `make W=1` and CONFIG_WERROR=y: kernel/trace/trace.c:2229:19: error: unused function 'run_tracer_selftest' [-Werror,-Wunused-function] 2229 | static inline int run_tracer_selftest(struct tracer *type) | ^~~~~~~~~~~~~~~~~~~ Fix this by dropping unused functions. See also commit 6863f5643dd7 ("kbuild: allow Clang to find unused static inline functions for W=1 build"). Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Bill Wendling <morbo@google.com> Cc: Justin Stitt <justinstitt@google.com> Link: https://lore.kernel.org/20240909105314.928302-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-09-09tracing/osnoise: Fix build when timerlat is not enabledSteven Rostedt
To fix some critical section races, the interface_lock was added to a few locations. One of those locations was above where the interface_lock was declared, so the declaration was moved up before that usage. Unfortunately, where it was placed was inside a CONFIG_TIMERLAT_TRACER ifdef block. As the interface_lock is used outside that config, this broke the build when CONFIG_OSNOISE_TRACER was enabled but CONFIG_TIMERLAT_TRACER was not. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: "Helena Anna" <helena.anna.dubel@intel.com> Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com> Cc: Tomas Glozar <tglozar@redhat.com> Link: https://lore.kernel.org/20240909103231.23a289e2@gandalf.local.home Fixes: e6a53481da29 ("tracing/timerlat: Only clear timer if a kthread exists") Reported-by: "Bityutskiy, Artem" <artem.bityutskiy@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-09-09net/mlx5: Fix bridge mode operations when there are no VFsBenjamin Poirier
Currently, trying to set the bridge mode attribute when numvfs=0 leads to a crash: bridge link set dev eth2 hwmode vepa [ 168.967392] BUG: kernel NULL pointer dereference, address: 0000000000000030 [...] [ 168.969989] RIP: 0010:mlx5_add_flow_rules+0x1f/0x300 [mlx5_core] [...] [ 168.976037] Call Trace: [ 168.976188] <TASK> [ 168.978620] _mlx5_eswitch_set_vepa_locked+0x113/0x230 [mlx5_core] [ 168.979074] mlx5_eswitch_set_vepa+0x7f/0xa0 [mlx5_core] [ 168.979471] rtnl_bridge_setlink+0xe9/0x1f0 [ 168.979714] rtnetlink_rcv_msg+0x159/0x400 [ 168.980451] netlink_rcv_skb+0x54/0x100 [ 168.980675] netlink_unicast+0x241/0x360 [ 168.980918] netlink_sendmsg+0x1f6/0x430 [ 168.981162] ____sys_sendmsg+0x3bb/0x3f0 [ 168.982155] ___sys_sendmsg+0x88/0xd0 [ 168.985036] __sys_sendmsg+0x59/0xa0 [ 168.985477] do_syscall_64+0x79/0x150 [ 168.987273] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 168.987773] RIP: 0033:0x7f8f7950f917 (esw->fdb_table.legacy.vepa_fdb is null) The bridge mode is only relevant when there are multiple functions per port. Therefore, prevent setting and getting this setting when there are no VFs. Note that after this change, there are no settings to change on the PF interface using `bridge link` when there are no VFs, so the interface no longer appears in the `bridge link` output. Fixes: 4b89251de024 ("net/mlx5: Support ndo bridge_setlink and getlink") Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: Verify support for scheduling element and TSAR typeCarolina Jubran
Before creating a scheduling element in a NIC or E-Switch scheduler, ensure that the requested element type is supported. If the element is of type Transmit Scheduling Arbiter (TSAR), also verify that the specific TSAR type is supported. Fixes: 214baf22870c ("net/mlx5e: Support HTB offload") Fixes: 85c5f7c9200e ("net/mlx5: E-switch, Create QoS on demand") Fixes: 0fe132eac38c ("net/mlx5: E-switch, Allow to add vports to rate groups") Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: Add missing masks and QoS bit masks for scheduling elementsCarolina Jubran
Add the missing masks for supported element types and Transmit Scheduling Arbiter (TSAR) types in scheduling elements. Also, add the corresponding bit masks for these types in the QoS capabilities of a NIC scheduler. Fixes: 214baf22870c ("net/mlx5e: Support HTB offload") Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: Explicitly set scheduling element and TSAR typeCarolina Jubran
Ensure the scheduling element type and TSAR type are explicitly initialized in the QoS rate group creation. This prevents potential issues due to default values. Fixes: 1ae258f8b343 ("net/mlx5: E-switch, Introduce rate limiting groups API") Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5e: Add missing link mode to ptys2ext_ethtool_mapShahar Shitrit
Add MLX5E_400GAUI_8_400GBASE_CR8 to the extended modes in ptys2ext_ethtool_table, since it was missing. Fixes: 6a897372417e ("net/mlx5: ethtool, Add ethtool support for 50Gbps per lane link modes") Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5e: Add missing link modes to ptys2ethtool_mapShahar Shitrit
Add MLX5E_1000BASE_T and MLX5E_100BASE_TX to the legacy modes in ptys2legacy_ethtool_table, since they were missing. Fixes: 665bc53969d7 ("net/mlx5e: Use new ethtool get/set link ksettings API") Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Carolina Jubran <cjubran@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: Update the list of the PCI supported devicesMaher Sanalla
Add the upcoming ConnectX-9 device ID to the table of supported PCI device IDs. Fixes: f908a35b2218 ("net/mlx5: Update the list of the PCI supported devices") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: HWS, added API and enabled HWS supportYevgeny Kliteynik
Enabling HWS support in the mlx5 driver: - added HWS API header - added HWS files in the mlx5 driver makefile - added kconfig flag that enables HWS compilation Reviewed-by: Erez Shitrit <erezsh@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: HWS, added send engine and context handlingYevgeny Kliteynik
Added implementation of send engine and handling of HWS context. Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: HWS, added debug dump and internal headersYevgeny Kliteynik
Added debug dump of the existing HWS state, and all the required internal definitions. To dump the HWS state, cat the following debugfs node: cat /sys/kernel/debug/mlx5/<PCI>/steering/fdb/ctx_<ctx_id> Reviewed-by: Hamdan Agbariya <hamdani@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: HWS, added backward-compatible API handlingYevgeny Kliteynik
Added implementation of backward-compatible (BWC) steering API. Native HWS API is very different from SWS API: - SWS is synchronous (rule creation/deletion API call returns when the rule is created/deleted), while HWS is asynchronous (it requires polling for completion in order to know when the rule creation/deletion happened) - SWS manages its own memory (it allocates/frees all the needed memory for steering rules, while HWS requires the rules memory to be allocated/freed outside the API In order to make HWS fit the existing fs-core steering API paradigm, this patch adds implementation of backward-compatible (BWC) steering API that has the bahaviour similar to SWS: among others, it encompasses all the rules' memory management and completion polling, presenting the usual synchronous API for the upper layer. A user that wishes to utilize the full speed potential of HWS can call the HWS async API and have rule insertion/deletion batching, lower memory management overhead, and lower CPU utilization. Such approach will be taken by the future Connection Tracking. Note that BWC steering doesn't support yet rules that require more than one match STE - complex rules. This support will be added later on. Reviewed-by: Erez Shitrit <erezsh@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: HWS, added memory management handlingYevgeny Kliteynik
Added object pools and buddy allocator functionality. Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: HWS, added vport handlingYevgeny Kliteynik
Vport is a virtual eswitch port that is associated with its virtual function (VF), physical function (PF) or sub-function (SF). This patch adds handling of vports in HWS. Reviewed-by: Hamdan Agbariya <hamdani@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: HWS, added modify header pattern and args handlingYevgeny Kliteynik
Packet headers/metadta manipulations are split into two parts: - Header Modify Pattern: an object that describes which fields will be modified and in which way - Header Modify Argument: an object that provides the values to be used for header modification Reviewed-by: Hamdan Agbariya <hamdani@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: HWS, added FW commands handlingYevgeny Kliteynik
This patch adds implementation of FW object handling, such as creation/destruction, modification, and querying. Reviewed-by: Hamdan Agbariya <hamdani@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: HWS, added matchers functionalityYevgeny Kliteynik
Matcher object encompasses all the building blocks that are needed in order to perform flow steering of a given flow: - flow table that serves as entering point of this matcher - Rule Table Context (RTC) objects to hold ll the Steering Table Entries (STEs), both for matching the flow and for performing actions - rules that describe the set of matching parameters for a flow and actions to perform in case of a hit. This patch adds implementation of matchers handling in HWS. Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: HWS, added definers handlingYevgeny Kliteynik
The Match Definer combines packet fields and a mask, creating a key which can be used for packet matching during steering flow processing. This patch adds handling of definer objects in HWS. Reviewed-by: Hamdan Agbariya <hamdani@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: HWS, added rules handlingYevgeny Kliteynik
Steering rule is a concept that includes match parameters for a flow, and actions to perform on the flows that match these parameters. This patch adds rules handling part of HW Steering. Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: HWS, added tables handlingYevgeny Kliteynik
Flow tables are SW objects that are comprised of list of matchers, that in turn define the properties of a flow to match on and set of actions to perform on the flows in case of match hit or miss. Reviewed-by: Itamar Gozlan <igozlan@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: HWS, added actions handlingYevgeny Kliteynik
When a packet matches a flow, the actions specified for the flow are applied. The supported actions include (but not limited to) the following: - drop: packet processing is stopped - go to vport: packet is forwarded to a specified vport - go to flow table: packet is forwarded to a specified table and processing continues there - push/pop vlan: add/remove vlan header respectively to/from the packet - insert/remove header: add/remove a user-defined header to/from the packet - counter: count the packet bytes in the specified counter - tag: tag the matching flow with a provided tag value - reformat: change the packet format by adding or removing some of its headers - modify header: modify the value of the packet headers with set/add/copy ops - range: match packet on range of values Reviewed-by: Erez Shitrit <erezsh@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: Added missing definitions in preparation for HW SteeringYevgeny Kliteynik
As part of preparation for HWS, added missing definitions in qp.h and fs_core.h: - FS_FT_FDB_RX/TX table types that are used by HWS in addition to an existing FS_FT_FDB - MLX5_WQE_CTRL_INITIATOR_SMALL_FENCE that is used by HWS to require fence in WQE Reviewed-by: Hamdan Agbariya <hamdani@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09net/mlx5: Added missing mlx5_ifc definition for HW SteeringYevgeny Kliteynik
Add mlx5_ifc definitions that are required for HWS support. Note that due to change in the mlx5_ifc_flow_table_context_bits structure that now includes both SWS and HWS bits in a union, this patch also includes small change in one of SWS files that was required for compilation. Reviewed-by: Hamdan Agbariya <hamdani@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09igb: Always call igb_xdp_ring_update_tail() under Tx lockSriram Yagnaraman
Always call igb_xdp_ring_update_tail() under __netif_tx_lock, add a comment and lockdep assert to indicate that. This is needed to share the same TX ring between XDP, XSK and slow paths. Furthermore, the current XDP implementation is racy on tail updates. Fixes: 9cbc948b5a20 ("igb: add XDP support") Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> [Kurt: Add lockdep assert and fixes tag] Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-09-09ice: fix VSI lists confusion when adding VLANsMichal Schmidt
The description of function ice_find_vsi_list_entry says: Search VSI list map with VSI count 1 However, since the blamed commit (see Fixes below), the function no longer checks vsi_count. This causes a problem in ice_add_vlan_internal, where the decision to share VSI lists between filter rules relies on the vsi_count of the found existing VSI list being 1. The reproducing steps: 1. Have a PF and two VFs. There will be a filter rule for VLAN 0, referring to a VSI list containing VSIs: 0 (PF), 2 (VF#0), 3 (VF#1). 2. Add VLAN 1234 to VF#0. ice will make the wrong decision to share the VSI list with the new rule. The wrong behavior may not be immediately apparent, but it can be observed with debug prints. 3. Add VLAN 1234 to VF#1. ice will unshare the VSI list for the VLAN 1234 rule. Due to the earlier bad decision, the newly created VSI list will contain VSIs 0 (PF) and 3 (VF#1), instead of expected 2 (VF#0) and 3 (VF#1). 4. Try pinging a network peer over the VLAN interface on VF#0. This fails. Reproducer script at: https://gitlab.com/mschmidt2/repro/-/blob/master/RHEL-46814/test-vlan-vsi-list-confusion.sh Commented debug trace: https://gitlab.com/mschmidt2/repro/-/blob/master/RHEL-46814/ice-vlan-vsi-lists-debug.txt Patch adding the debug prints: https://gitlab.com/mschmidt2/linux/-/commit/f8a8814623944a45091a77c6094c40bfe726bfdb (Unsafe, by the way. Lacks rule_lock when dumping in ice_remove_vlan.) Michal Swiatkowski added to the explanation that the bug is caused by reusing a VSI list created for VLAN 0. All created VFs' VSIs are added to VLAN 0 filter. When a non-zero VLAN is created on a VF which is already in VLAN 0 (normal case), the VSI list from VLAN 0 is reused. It leads to a problem because all VFs (VSIs to be specific) that are subscribed to VLAN 0 will now receive a new VLAN tag traffic. This is one bug, another is the bug described above. Removing filters from one VF will remove VLAN filter from the previous VF. It happens a VF is reset. Example: - creation of 3 VFs - we have VSI list (used for VLAN 0) [0 (pf), 2 (vf1), 3 (vf2), 4 (vf3)] - we are adding VLAN 100 on VF1, we are reusing the previous list because 2 is there - VLAN traffic works fine, but VLAN 100 tagged traffic can be received on all VSIs from the list (for example broadcast or unicast) - trust is turning on VF2, VF2 is resetting, all filters from VF2 are removed; the VLAN 100 filter is also removed because 3 is on the list - VLAN traffic to VF1 isn't working anymore, there is a need to recreate VLAN interface to readd VLAN filter One thing I'm not certain about is the implications for the LAG feature, which is another caller of ice_find_vsi_list_entry. I don't have a LAG-capable card at hand to test. Fixes: 23ccae5ce15f ("ice: changes to the interface with the HW and FW for SRIOV_VF+LAG") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Dave Ertman <David.m.ertman@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-09-09ice: stop calling pci_disable_device() as we use pcimPrzemek Kitszel
Our driver uses devres to manage resources, in particular we call pcim_enable_device(), what also means we express the intent to get automatic pci_disable_device() call at driver removal. Manual calls to pci_disable_device() misuse the API. Recent commit (see "Fixes" tag) has changed the removal action from conditional (silent ignore of double call to pci_disable_device()) to unconditional, but able to catch unwanted redundant calls; see cited "Fixes" commit for details. Since that, unloading the driver yields following warn+splat: [70633.628490] ice 0000:af:00.7: disabling already-disabled device [70633.628512] WARNING: CPU: 52 PID: 33890 at drivers/pci/pci.c:2250 pci_disable_device+0xf4/0x100 ... [70633.628744] ? pci_disable_device+0xf4/0x100 [70633.628752] release_nodes+0x4a/0x70 [70633.628759] devres_release_all+0x8b/0xc0 [70633.628768] device_unbind_cleanup+0xe/0x70 [70633.628774] device_release_driver_internal+0x208/0x250 [70633.628781] driver_detach+0x47/0x90 [70633.628786] bus_remove_driver+0x80/0x100 [70633.628791] pci_unregister_driver+0x2a/0xb0 [70633.628799] ice_module_exit+0x11/0x3a [ice] Note that this is the only Intel ethernet driver that needs such fix. Fixes: f748a07a0b64 ("PCI: Remove legacy pcim_release()") Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Philipp Stanner <pstanner@redhat.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-09-09ice: fix accounting for filters shared by multiple VSIsJacob Keller
When adding a switch filter (such as a MAC or VLAN filter), it is expected that the driver will detect the case where the filter already exists, and return -EEXIST. This is used by calling code such as ice_vc_add_mac_addr, and ice_vsi_add_vlan to avoid incrementing the accounting fields such as vsi->num_vlan or vf->num_mac. This logic works correctly for the case where only a single VSI has added a given switch filter. When a second VSI adds the same switch filter, the driver converts the existing filter from an ICE_FWD_TO_VSI filter into an ICE_FWD_TO_VSI_LIST filter. This saves switch resources, by ensuring that multiple VSIs can re-use the same filter. The ice_add_update_vsi_list() function is responsible for doing this conversion. When first converting a filter from the FWD_TO_VSI into FWD_TO_VSI_LIST, it checks if the VSI being added is the same as the existing rule's VSI. In such a case it returns -EEXIST. However, when the switch rule has already been converted to a FWD_TO_VSI_LIST, the logic is different. Adding a new VSI in this case just requires extending the VSI list entry. The logic for checking if the rule already exists in this case returns 0 instead of -EEXIST. This breaks the accounting logic mentioned above, so the counters for how many MAC and VLAN filters exist for a given VF or VSI no longer accurately reflect the actual count. This breaks other code which relies on these counts. In typical usage this primarily affects such filters generally shared by multiple VSIs such as VLAN 0, or broadcast and multicast MAC addresses. Fix this by correctly reporting -EEXIST in the case of adding the same VSI to a switch rule already converted to ICE_FWD_TO_VSI_LIST. Fixes: 9daf8208dd4d ("ice: Add support for switch filter programming") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-09-09ice: Fix lldp packets dropping after changing the number of channelsMartyna Szapar-Mudlaw
After vsi setup refactor commit 6624e780a577 ("ice: split ice_vsi_setup into smaller functions") ice_cfg_sw_lldp function which removes rx rule directing LLDP packets to vsi is moved from ice_vsi_release to ice_vsi_decfg function. ice_vsi_decfg is used in more cases than just in vsi_release resulting in unnecessary removal of rx lldp packets handling switch rule. This leads to lldp packets being dropped after a change number of channels via ethtool. This patch moves ice_cfg_sw_lldp function that removes rx lldp sw rule back to ice_vsi_release function. Fixes: 6624e780a577 ("ice: split ice_vsi_setup into smaller functions") Reported-by: Matěj Grégr <mgregr@netx.as> Closes: https://lore.kernel.org/intel-wired-lan/1be45a76-90af-4813-824f-8398b69745a9@netx.as/T/#u Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-09-09hwmon: (pmbus) Conditionally clear individual status bits for pmbus rev >= 1.2Patryk Biel
The current implementation of pmbus_show_boolean assumes that all devices support write-back operation of status register to clear pending warnings or faults. Since clearing individual bits in the status registers was only introduced in PMBus specification 1.2, this operation may not be supported by some older devices. This can result in an error while reading boolean attributes such as temp1_max_alarm. Fetch PMBus revision supported by the device and modify pmbus_show_boolean so that it only tries to clear individual status bits if the device is compliant with PMBus specs >= 1.2. Otherwise clear all fault indicators on the current page after a fault status was reported. Fixes: 35f165f08950a ("hwmon: (pmbus) Clear pmbus fault/warning bits after read") Signed-off-by: Patryk Biel <pbiel7@gmail.com> Message-ID: <20240909-pmbus-status-reg-clearing-v1-1-f1c0d68c6408@gmail.com> [groeck: Rewrote description Moved revision detection code ahead of clear faults command Assigned revision if return value from PMBUS_REVISION command is 0 Improved return value check from calling _pmbus_write_byte_data()] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-09-09platform/x86: panasonic-laptop: Allocate 1 entry extra in the sinf arrayHans de Goede
Some DSDT-s have an off-by-one bug where the SINF package count is one higher than the SQTY reported value, allocate 1 entry extra. Also make the SQTY <-> SINF package count mismatch error more verbose to help debugging similar issues in the future. This fixes the panasonic-laptop driver failing to probe() on some devices with the following errors: [ 3.958887] SQTY reports bad SINF length SQTY: 37 SINF-pkg-count: 38 [ 3.958892] Couldn't retrieve BIOS data [ 3.983685] Panasonic Laptop Support - With Macros: probe of MAT0019:00 failed with error -5 Fixes: 709ee531c153 ("panasonic-laptop: add Panasonic Let's Note laptop extras driver v0.94") Cc: stable@vger.kernel.org Tested-by: James Harmison <jharmison@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20240909113227.254470-2-hdegoede@redhat.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-09-09platform/x86: panasonic-laptop: Fix SINF array out of bounds accessesHans de Goede
The panasonic laptop code in various places uses the SINF array with index values of 0 - SINF_CUR_BRIGHT(0x0d) without checking that the SINF array is big enough. Not all panasonic laptops have this many SINF array entries, for example the Toughbook CF-18 model only has 10 SINF array entries. So it only supports the AC+DC brightness entries and mute. Check that the SINF array has a minimum size which covers all AC+DC brightness entries and refuse to load if the SINF array is smaller. For higher SINF indexes hide the sysfs attributes when the SINF array does not contain an entry for that attribute, avoiding show()/store() accessing the array out of bounds and add bounds checking to the probe() and resume() code accessing these. Fixes: e424fb8cc4e6 ("panasonic-laptop: avoid overflow in acpi_pcc_hotkey_add()") Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20240909113227.254470-1-hdegoede@redhat.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-09-09Merge tag 'ath-next-20240909' of ↵Kalle Valo
git://git.kernel.org/pub/scm/linux/kernel/git/ath/ath ath.git patches for v6.12 This is once again a fairly light pull request since ath12k is still working on MLO-related changes, and the other drivers are mostly in maintenance mode. ath12k * Fix a frame-larger-than warning seen with debug builds * Fix flex-array-member-not-at-end warnings ath11k * Fix flex-array-member-not-at-end warnings ath9k * Fix a syzbot-reported issue on USB-based devices
2024-09-09Merge tag 'mt76-for-kvalo-2024-09-06' of https://github.com/nbd168/wirelessKalle Valo
mt76 patches for 6.12 - fixes - mt7915 .sta_state support - mt7915 hardware restart improvements
2024-09-09Merge tag 'bcachefs-2024-09-09' of git://evilpiepirate.org/bcachefsLinus Torvalds
Pull bcachefs fixes from Kent Overstreet: - fix ca->io_ref usage; analagous to previous patch doing that for main discard path - cond_resched() in __journal_keys_sort(), cutting down on "hung task" warnings when journal is big - rest of basic BCH_SB_MEMBER_INVALID support - and the critical one: don't delete open files in online fsck, this was causing the "dirent points to inode that doesn't point back" inconsistencies some users were seeing * tag 'bcachefs-2024-09-09' of git://evilpiepirate.org/bcachefs: bcachefs: Don't delete open files in online fsck bcachefs: fix btree_key_cache sysfs knob bcachefs: More BCH_SB_MEMBER_INVALID support bcachefs: Simplify bch2_bkey_drop_ptrs() bcachefs: Add a cond_resched() to __journal_keys_sort() bcachefs: Fix ca->io_ref usage
2024-09-10erofs: fix error handling in z_erofs_init_decompressorSandeep Dhavale
If we get a failure at the first decompressor init (i = 0), the clean up while loop could enter infinite loop due to wrong while check. Check the value of i now to see if we need any clean up at all. Fixes: 5a7cce827ee9 ("erofs: refine z_erofs_{init,exit}_subsystem()") Reported-by: liujinbao1 <liujinbao1@xiaomi.com> Signed-off-by: Sandeep Dhavale <dhavale@google.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Chao Yu <chao@kernel.org> Link: https://lore.kernel.org/r/20240905060027.2388893-1-dhavale@google.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2024-09-10erofs: clean up erofs_register_sysfs()Gao Xiang
After commit 684b290abc77 ("erofs: add support for FS_IOC_GETFSSYSFSPATH"), `sb->s_sysfs_name` is now valid. Just use it to get rid of duplicated logic. Reviewed-by: Sandeep Dhavale <dhavale@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20240828095232.571946-1-hsiangkao@linux.alibaba.com
2024-09-10erofs: fix incorrect symlink detection in fast symlinkGao Xiang
Fast symlink can be used if the on-disk symlink data is stored in the same block as the on-disk inode, so we don’t need to trigger another I/O for symlink data. However, currently fs correction could be reported _incorrectly_ if inode xattrs are too large. In fact, these should be valid images although they cannot be handled as fast symlinks. Many thanks to Colin for reporting this! Reported-by: Colin Walters <walters@verbum.org> Reported-by: https://honggfuzz.dev/ Link: https://lore.kernel.org/r/bb2dd430-7de0-47da-ae5b-82ab2dd4d945@app.fastmail.com Fixes: 431339ba9042 ("staging: erofs: add inode operations") [ Note that it's a runtime misbehavior instead of a security issue. ] Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20240909031911.1174718-1-hsiangkao@linux.alibaba.com
2024-09-09Merge tag 'devfreq-next-for-6.12' of ↵Rafael J. Wysocki
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux Merge devfreq updates for v6.12 from Chanwoo Choi: "Detailed description for this pull request: - Add missing MODULE_DESCRIPTION() macros for devfreq governors. - Use Use devm_clk_get_enabled() helpers for exyns-bus devfreq driver. - Use of_property_present() instead of of_get_property() for imx-bus devfreq driver." * tag 'devfreq-next-for-6.12' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux: PM / devfreq: imx-bus: Use of_property_present() PM / devfreq: exynos: Use Use devm_clk_get_enabled() helpers PM/devfreq: governor: add missing MODULE_DESCRIPTION() macros
2024-09-09Merge tag 'hyperv-fixes-signed-20240908' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Add a documentation overview of Confidential Computing VM support (Michael Kelley) - Use lapic timer in a TDX VM without paravisor (Dexuan Cui) - Set X86_FEATURE_TSC_KNOWN_FREQ when Hyper-V provides frequency (Michael Kelley) - Fix a kexec crash due to VP assist page corruption (Anirudh Rayabharam) - Python3 compatibility fix for lsvmbus (Anthony Nandaa) - Misc fixes (Rachel Menge, Roman Kisel, zhang jiao, Hongbo Li) * tag 'hyperv-fixes-signed-20240908' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: hv: vmbus: Constify struct kobj_type and struct attribute_group tools: hv: rm .*.cmd when make clean x86/hyperv: fix kexec crash due to VP assist page corruption Drivers: hv: vmbus: Fix the misplaced function description tools: hv: lsvmbus: change shebang to use python3 x86/hyperv: Set X86_FEATURE_TSC_KNOWN_FREQ when Hyper-V provides frequency Documentation: hyperv: Add overview of Confidential Computing VM support clocksource: hyper-v: Use lapic timer in a TDX VM without paravisor Drivers: hv: Remove deprecated hv_fcopy declarations
2024-09-09security: Update file_set_fowner documentationMickaël Salaün
Highlight that the file_set_fowner hook is now called with a lock held. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: Christian Brauner <brauner@kernel.org> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-09-09fs: Fix file_set_fowner LSM hook inconsistenciesMickaël Salaün
The fcntl's F_SETOWN command sets the process that handle SIGIO/SIGURG for the related file descriptor. Before this change, the file_set_fowner LSM hook was always called, ignoring the VFS logic which may not actually change the process that handles SIGIO (e.g. TUN, TTY, dnotify), nor update the related UID/EUID. Moreover, because security_file_set_fowner() was called without lock (e.g. f_owner.lock), concurrent F_SETOWN commands could result to a race condition and inconsistent LSM states (e.g. SELinux's fown_sid) compared to struct fown_struct's UID/EUID. This change makes sure the LSM states are always in sync with the VFS state by moving the security_file_set_fowner() call close to the UID/EUID updates and using the same f_owner.lock . Rename f_modown() to __f_setown() to simplify code. Cc: stable@vger.kernel.org Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Casey Schaufler <casey@schaufler-ca.com> Cc: Christian Brauner <brauner@kernel.org> Cc: James Morris <jmorris@namei.org> Cc: Jann Horn <jannh@google.com> Cc: Ondrej Mosnacek <omosnace@redhat.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Serge E. Hallyn <serge@hallyn.com> Cc: Stephen Smalley <stephen.smalley.work@gmail.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Mickaël Salaün <mic@digikod.net> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-09-09Merge branch 'thermal-core'Rafael J. Wysocki
Merge thermal core fixes and cleanups for 6.12: - Refuse to accept trip point temperature or hysteresis that would lead to an invalid threshold value when setting them via sysfs (Rafael Wysocki). - Adjust states of all uninitialized instances in the .manage() callback of the Bang-bang thermal governor (Rafael Wysocki). - Drop a couple of redundant checks along with the code depending on them from the thermal core (Rafael Wysocki). - Rearrange the thermal core to avoid redundant checks and simplify control flow in a couple of code paths (Rafael Wysocki). * thermal-core: thermal: core: Drop thermal_zone_device_is_enabled() thermal: core: Check passive delay in monitor_thermal_zone() thermal: core: Drop dead code from monitor_thermal_zone() thermal: core: Drop redundant lockdep_assert_held() thermal: gov_bang_bang: Adjust states of all uninitialized instances thermal: sysfs: Add sanity checks for trip temperature and hysteresis
2024-09-09printk: Export match_devname_and_update_preferred_console()Yu Liao
When building serial_base as a module, modpost fails with the following error message: ERROR: modpost: "match_devname_and_update_preferred_console" [drivers/tty/serial/serial_base.ko] undefined! Export the symbol to allow using it from modules. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202409071312.qlwtTOS1-lkp@intel.com/ Fixes: 12c91cec3155 ("serial: core: Add serial_base_match_and_update_preferred_console()") Signed-off-by: Yu Liao <liaoyu15@huawei.com> Link: https://lore.kernel.org/r/20240909075652.747370-1-liaoyu15@huawei.com Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-09-09io_uring/sqpoll: do not allow pinning outside of cpusetFelix Moessbauer
The submit queue polling threads are userland threads that just never exit to the userland. When creating the thread with IORING_SETUP_SQ_AFF, the affinity of the poller thread is set to the cpu specified in sq_thread_cpu. However, this CPU can be outside of the cpuset defined by the cgroup cpuset controller. This violates the rules defined by the cpuset controller and is a potential issue for realtime applications. In b7ed6d8ffd6 we fixed the default affinity of the poller thread, in case no explicit pinning is required by inheriting the one of the creating task. In case of explicit pinning, the check is more complicated, as also a cpu outside of the parent cpumask is allowed. We implemented this by using cpuset_cpus_allowed (that has support for cgroup cpusets) and testing if the requested cpu is in the set. Fixes: 37d1e2e3642e ("io_uring: move SQPOLL thread io-wq forked worker") Cc: stable@vger.kernel.org # 6.1+ Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com> Link: https://lore.kernel.org/r/20240909150036.55921-1-felix.moessbauer@siemens.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-09-09debugobjects: Remove redundant checks in fill_pool()Zhen Lei
fill_pool() checks locklessly at the beginning whether the pool has to be refilled. After that it checks locklessly in a loop whether the free list contains objects and repeats the refill check. If both conditions are true, it acquires the pool lock and tries to move objects from the free list to the pool repeating the same checks again. There are two redundant issues with that: 1) The repeated check for the fill condition 2) The loop processing The repeated check is pointless as it was just established that fill is required. The condition has to be re-evaluated under the lock anyway. The loop processing is not required either because there is practically zero chance that a repeated attempt will succeed if the checks under the lock terminate the moving of objects. Remove the redundant check and replace the loop with a simple if condition. [ tglx: Massaged change log ] Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20240904133944.2124-4-thunder.leizhen@huawei.com
2024-09-09debugobjects: Fix conditions in fill_pool()Zhen Lei
fill_pool() uses 'obj_pool_min_free' to decide whether objects should be handed back to the kmem cache. But 'obj_pool_min_free' records the lowest historical value of the number of objects in the object pool and not the minimum number of objects which should be kept in the pool. Use 'debug_objects_pool_min_level' instead, which holds the minimum number which was scaled to the number of CPUs at boot time. [ tglx: Massage change log ] Fixes: d26bf5056fc0 ("debugobjects: Reduce number of pool_lock acquisitions in fill_pool()") Fixes: 36c4ead6f6df ("debugobjects: Add global free list and the counter") Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20240904133944.2124-3-thunder.leizhen@huawei.com
2024-09-09debugobjects: Fix the compilation attributes of some global variablesZhen Lei
1. Both debug_objects_pool_min_level and debug_objects_pool_size are read-only after initialization, change attribute '__read_mostly' to '__ro_after_init', and remove '__data_racy'. 2. Many global variables are read in the debug_stats_show() function, but didn't mask KCSAN's detection. Add '__data_racy' for them. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20240904133944.2124-2-thunder.leizhen@huawei.com