Age | Commit message (Collapse) | Author |
|
&idpf_tx_buffer is almost identical to the previous generations, as well
as the way it's handled. Moreover, relying on dma_unmap_addr() and
!!buf->skb instead of explicit defining of buffer's type was never good.
Use the newly added libeth helpers to do it properly and reduce the
copy-paste around the Tx code.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Software-side Tx buffers for storing DMA, frame size, skb pointers etc.
are pretty much generic and every driver defines them the same way. The
same can be said for software Tx completions -- same napi_consume_skb()s
and all that...
Add a couple simple wrappers for doing that to stop repeating the old
tale at least within the Intel code. Drivers are free to use 'priv'
member at the end of the structure.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
A helper function defined but not used. This, in particular,
prevents kernel builds with clang, `make W=1` and CONFIG_WERROR=y:
kernel/trace/trace.c:2229:19: error: unused function 'run_tracer_selftest' [-Werror,-Wunused-function]
2229 | static inline int run_tracer_selftest(struct tracer *type)
| ^~~~~~~~~~~~~~~~~~~
Fix this by dropping unused functions.
See also commit 6863f5643dd7 ("kbuild: allow Clang to find unused static
inline functions for W=1 build").
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/20240909105314.928302-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
To fix some critical section races, the interface_lock was added to a few
locations. One of those locations was above where the interface_lock was
declared, so the declaration was moved up before that usage.
Unfortunately, where it was placed was inside a CONFIG_TIMERLAT_TRACER
ifdef block. As the interface_lock is used outside that config, this broke
the build when CONFIG_OSNOISE_TRACER was enabled but
CONFIG_TIMERLAT_TRACER was not.
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Helena Anna" <helena.anna.dubel@intel.com>
Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Cc: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/20240909103231.23a289e2@gandalf.local.home
Fixes: e6a53481da29 ("tracing/timerlat: Only clear timer if a kthread exists")
Reported-by: "Bityutskiy, Artem" <artem.bityutskiy@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
Currently, trying to set the bridge mode attribute when numvfs=0 leads to a
crash:
bridge link set dev eth2 hwmode vepa
[ 168.967392] BUG: kernel NULL pointer dereference, address: 0000000000000030
[...]
[ 168.969989] RIP: 0010:mlx5_add_flow_rules+0x1f/0x300 [mlx5_core]
[...]
[ 168.976037] Call Trace:
[ 168.976188] <TASK>
[ 168.978620] _mlx5_eswitch_set_vepa_locked+0x113/0x230 [mlx5_core]
[ 168.979074] mlx5_eswitch_set_vepa+0x7f/0xa0 [mlx5_core]
[ 168.979471] rtnl_bridge_setlink+0xe9/0x1f0
[ 168.979714] rtnetlink_rcv_msg+0x159/0x400
[ 168.980451] netlink_rcv_skb+0x54/0x100
[ 168.980675] netlink_unicast+0x241/0x360
[ 168.980918] netlink_sendmsg+0x1f6/0x430
[ 168.981162] ____sys_sendmsg+0x3bb/0x3f0
[ 168.982155] ___sys_sendmsg+0x88/0xd0
[ 168.985036] __sys_sendmsg+0x59/0xa0
[ 168.985477] do_syscall_64+0x79/0x150
[ 168.987273] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 168.987773] RIP: 0033:0x7f8f7950f917
(esw->fdb_table.legacy.vepa_fdb is null)
The bridge mode is only relevant when there are multiple functions per
port. Therefore, prevent setting and getting this setting when there are no
VFs.
Note that after this change, there are no settings to change on the PF
interface using `bridge link` when there are no VFs, so the interface no
longer appears in the `bridge link` output.
Fixes: 4b89251de024 ("net/mlx5: Support ndo bridge_setlink and getlink")
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Before creating a scheduling element in a NIC or E-Switch scheduler,
ensure that the requested element type is supported. If the element is
of type Transmit Scheduling Arbiter (TSAR), also verify that the
specific TSAR type is supported.
Fixes: 214baf22870c ("net/mlx5e: Support HTB offload")
Fixes: 85c5f7c9200e ("net/mlx5: E-switch, Create QoS on demand")
Fixes: 0fe132eac38c ("net/mlx5: E-switch, Allow to add vports to rate groups")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Add the missing masks for supported element types and Transmit
Scheduling Arbiter (TSAR) types in scheduling elements.
Also, add the corresponding bit masks for these types in the QoS
capabilities of a NIC scheduler.
Fixes: 214baf22870c ("net/mlx5e: Support HTB offload")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Ensure the scheduling element type and TSAR type are explicitly
initialized in the QoS rate group creation.
This prevents potential issues due to default values.
Fixes: 1ae258f8b343 ("net/mlx5: E-switch, Introduce rate limiting groups API")
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Add MLX5E_400GAUI_8_400GBASE_CR8 to the extended modes
in ptys2ext_ethtool_table, since it was missing.
Fixes: 6a897372417e ("net/mlx5: ethtool, Add ethtool support for 50Gbps per lane link modes")
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Add MLX5E_1000BASE_T and MLX5E_100BASE_TX to the legacy
modes in ptys2legacy_ethtool_table, since they were missing.
Fixes: 665bc53969d7 ("net/mlx5e: Use new ethtool get/set link ksettings API")
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Add the upcoming ConnectX-9 device ID to the table of supported
PCI device IDs.
Fixes: f908a35b2218 ("net/mlx5: Update the list of the PCI supported devices")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Enabling HWS support in the mlx5 driver:
- added HWS API header
- added HWS files in the mlx5 driver makefile
- added kconfig flag that enables HWS compilation
Reviewed-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Added implementation of send engine and handling of HWS context.
Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Added debug dump of the existing HWS state,
and all the required internal definitions.
To dump the HWS state, cat the following debugfs node:
cat /sys/kernel/debug/mlx5/<PCI>/steering/fdb/ctx_<ctx_id>
Reviewed-by: Hamdan Agbariya <hamdani@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Added implementation of backward-compatible (BWC) steering API.
Native HWS API is very different from SWS API:
- SWS is synchronous (rule creation/deletion API call returns when
the rule is created/deleted), while HWS is asynchronous (it
requires polling for completion in order to know when the rule
creation/deletion happened)
- SWS manages its own memory (it allocates/frees all the needed
memory for steering rules, while HWS requires the rules memory
to be allocated/freed outside the API
In order to make HWS fit the existing fs-core steering API paradigm,
this patch adds implementation of backward-compatible (BWC) steering
API that has the bahaviour similar to SWS: among others, it encompasses
all the rules' memory management and completion polling, presenting
the usual synchronous API for the upper layer.
A user that wishes to utilize the full speed potential of HWS can
call the HWS async API and have rule insertion/deletion batching,
lower memory management overhead, and lower CPU utilization.
Such approach will be taken by the future Connection Tracking.
Note that BWC steering doesn't support yet rules that require more
than one match STE - complex rules.
This support will be added later on.
Reviewed-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Added object pools and buddy allocator functionality.
Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Vport is a virtual eswitch port that is associated with its virtual
function (VF), physical function (PF) or sub-function (SF).
This patch adds handling of vports in HWS.
Reviewed-by: Hamdan Agbariya <hamdani@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Packet headers/metadta manipulations are split into two parts:
- Header Modify Pattern: an object that describes which fields
will be modified and in which way
- Header Modify Argument: an object that provides the values
to be used for header modification
Reviewed-by: Hamdan Agbariya <hamdani@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
This patch adds implementation of FW object handling, such
as creation/destruction, modification, and querying.
Reviewed-by: Hamdan Agbariya <hamdani@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Matcher object encompasses all the building blocks that are
needed in order to perform flow steering of a given flow:
- flow table that serves as entering point of this matcher
- Rule Table Context (RTC) objects to hold ll the Steering
Table Entries (STEs), both for matching the flow and for
performing actions
- rules that describe the set of matching parameters for a
flow and actions to perform in case of a hit.
This patch adds implementation of matchers handling in HWS.
Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The Match Definer combines packet fields and a mask,
creating a key which can be used for packet matching
during steering flow processing.
This patch adds handling of definer objects in HWS.
Reviewed-by: Hamdan Agbariya <hamdani@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Steering rule is a concept that includes match parameters for a flow,
and actions to perform on the flows that match these parameters.
This patch adds rules handling part of HW Steering.
Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Flow tables are SW objects that are comprised of list of matchers,
that in turn define the properties of a flow to match on and set
of actions to perform on the flows in case of match hit or miss.
Reviewed-by: Itamar Gozlan <igozlan@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
When a packet matches a flow, the actions specified for the flow are
applied. The supported actions include (but not limited to) the following:
- drop: packet processing is stopped
- go to vport: packet is forwarded to a specified vport
- go to flow table: packet is forwarded to a specified table
and processing continues there
- push/pop vlan: add/remove vlan header respectively to/from the packet
- insert/remove header: add/remove a user-defined header to/from
the packet
- counter: count the packet bytes in the specified counter
- tag: tag the matching flow with a provided tag value
- reformat: change the packet format by adding or removing some
of its headers
- modify header: modify the value of the packet headers with
set/add/copy ops
- range: match packet on range of values
Reviewed-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
As part of preparation for HWS, added missing definitions
in qp.h and fs_core.h:
- FS_FT_FDB_RX/TX table types that are used by HWS in addition
to an existing FS_FT_FDB
- MLX5_WQE_CTRL_INITIATOR_SMALL_FENCE that is used by HWS to
require fence in WQE
Reviewed-by: Hamdan Agbariya <hamdani@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Add mlx5_ifc definitions that are required for HWS support.
Note that due to change in the mlx5_ifc_flow_table_context_bits
structure that now includes both SWS and HWS bits in a union,
this patch also includes small change in one of SWS files that
was required for compilation.
Reviewed-by: Hamdan Agbariya <hamdani@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Always call igb_xdp_ring_update_tail() under __netif_tx_lock, add a comment
and lockdep assert to indicate that. This is needed to share the same TX
ring between XDP, XSK and slow paths. Furthermore, the current XDP
implementation is racy on tail updates.
Fixes: 9cbc948b5a20 ("igb: add XDP support")
Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
[Kurt: Add lockdep assert and fixes tag]
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The description of function ice_find_vsi_list_entry says:
Search VSI list map with VSI count 1
However, since the blamed commit (see Fixes below), the function no
longer checks vsi_count. This causes a problem in ice_add_vlan_internal,
where the decision to share VSI lists between filter rules relies on the
vsi_count of the found existing VSI list being 1.
The reproducing steps:
1. Have a PF and two VFs.
There will be a filter rule for VLAN 0, referring to a VSI list
containing VSIs: 0 (PF), 2 (VF#0), 3 (VF#1).
2. Add VLAN 1234 to VF#0.
ice will make the wrong decision to share the VSI list with the new
rule. The wrong behavior may not be immediately apparent, but it can
be observed with debug prints.
3. Add VLAN 1234 to VF#1.
ice will unshare the VSI list for the VLAN 1234 rule. Due to the
earlier bad decision, the newly created VSI list will contain
VSIs 0 (PF) and 3 (VF#1), instead of expected 2 (VF#0) and 3 (VF#1).
4. Try pinging a network peer over the VLAN interface on VF#0.
This fails.
Reproducer script at:
https://gitlab.com/mschmidt2/repro/-/blob/master/RHEL-46814/test-vlan-vsi-list-confusion.sh
Commented debug trace:
https://gitlab.com/mschmidt2/repro/-/blob/master/RHEL-46814/ice-vlan-vsi-lists-debug.txt
Patch adding the debug prints:
https://gitlab.com/mschmidt2/linux/-/commit/f8a8814623944a45091a77c6094c40bfe726bfdb
(Unsafe, by the way. Lacks rule_lock when dumping in ice_remove_vlan.)
Michal Swiatkowski added to the explanation that the bug is caused by
reusing a VSI list created for VLAN 0. All created VFs' VSIs are added
to VLAN 0 filter. When a non-zero VLAN is created on a VF which is already
in VLAN 0 (normal case), the VSI list from VLAN 0 is reused.
It leads to a problem because all VFs (VSIs to be specific) that are
subscribed to VLAN 0 will now receive a new VLAN tag traffic. This is
one bug, another is the bug described above. Removing filters from
one VF will remove VLAN filter from the previous VF. It happens a VF is
reset. Example:
- creation of 3 VFs
- we have VSI list (used for VLAN 0) [0 (pf), 2 (vf1), 3 (vf2), 4 (vf3)]
- we are adding VLAN 100 on VF1, we are reusing the previous list
because 2 is there
- VLAN traffic works fine, but VLAN 100 tagged traffic can be received
on all VSIs from the list (for example broadcast or unicast)
- trust is turning on VF2, VF2 is resetting, all filters from VF2 are
removed; the VLAN 100 filter is also removed because 3 is on the list
- VLAN traffic to VF1 isn't working anymore, there is a need to recreate
VLAN interface to readd VLAN filter
One thing I'm not certain about is the implications for the LAG feature,
which is another caller of ice_find_vsi_list_entry. I don't have a
LAG-capable card at hand to test.
Fixes: 23ccae5ce15f ("ice: changes to the interface with the HW and FW for SRIOV_VF+LAG")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Dave Ertman <David.m.ertman@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Our driver uses devres to manage resources, in particular we call
pcim_enable_device(), what also means we express the intent to get
automatic pci_disable_device() call at driver removal. Manual calls to
pci_disable_device() misuse the API.
Recent commit (see "Fixes" tag) has changed the removal action from
conditional (silent ignore of double call to pci_disable_device()) to
unconditional, but able to catch unwanted redundant calls; see cited
"Fixes" commit for details.
Since that, unloading the driver yields following warn+splat:
[70633.628490] ice 0000:af:00.7: disabling already-disabled device
[70633.628512] WARNING: CPU: 52 PID: 33890 at drivers/pci/pci.c:2250 pci_disable_device+0xf4/0x100
...
[70633.628744] ? pci_disable_device+0xf4/0x100
[70633.628752] release_nodes+0x4a/0x70
[70633.628759] devres_release_all+0x8b/0xc0
[70633.628768] device_unbind_cleanup+0xe/0x70
[70633.628774] device_release_driver_internal+0x208/0x250
[70633.628781] driver_detach+0x47/0x90
[70633.628786] bus_remove_driver+0x80/0x100
[70633.628791] pci_unregister_driver+0x2a/0xb0
[70633.628799] ice_module_exit+0x11/0x3a [ice]
Note that this is the only Intel ethernet driver that needs such fix.
Fixes: f748a07a0b64 ("PCI: Remove legacy pcim_release()")
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When adding a switch filter (such as a MAC or VLAN filter), it is expected
that the driver will detect the case where the filter already exists, and
return -EEXIST. This is used by calling code such as ice_vc_add_mac_addr,
and ice_vsi_add_vlan to avoid incrementing the accounting fields such as
vsi->num_vlan or vf->num_mac.
This logic works correctly for the case where only a single VSI has added a
given switch filter.
When a second VSI adds the same switch filter, the driver converts the
existing filter from an ICE_FWD_TO_VSI filter into an ICE_FWD_TO_VSI_LIST
filter. This saves switch resources, by ensuring that multiple VSIs can
re-use the same filter.
The ice_add_update_vsi_list() function is responsible for doing this
conversion. When first converting a filter from the FWD_TO_VSI into
FWD_TO_VSI_LIST, it checks if the VSI being added is the same as the
existing rule's VSI. In such a case it returns -EEXIST.
However, when the switch rule has already been converted to a
FWD_TO_VSI_LIST, the logic is different. Adding a new VSI in this case just
requires extending the VSI list entry. The logic for checking if the rule
already exists in this case returns 0 instead of -EEXIST.
This breaks the accounting logic mentioned above, so the counters for how
many MAC and VLAN filters exist for a given VF or VSI no longer accurately
reflect the actual count. This breaks other code which relies on these
counts.
In typical usage this primarily affects such filters generally shared by
multiple VSIs such as VLAN 0, or broadcast and multicast MAC addresses.
Fix this by correctly reporting -EEXIST in the case of adding the same VSI
to a switch rule already converted to ICE_FWD_TO_VSI_LIST.
Fixes: 9daf8208dd4d ("ice: Add support for switch filter programming")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
After vsi setup refactor commit 6624e780a577 ("ice: split ice_vsi_setup
into smaller functions") ice_cfg_sw_lldp function which removes rx rule
directing LLDP packets to vsi is moved from ice_vsi_release to
ice_vsi_decfg function. ice_vsi_decfg is used in more cases than just in
vsi_release resulting in unnecessary removal of rx lldp packets handling
switch rule. This leads to lldp packets being dropped after a change number
of channels via ethtool.
This patch moves ice_cfg_sw_lldp function that removes rx lldp sw rule back
to ice_vsi_release function.
Fixes: 6624e780a577 ("ice: split ice_vsi_setup into smaller functions")
Reported-by: Matěj Grégr <mgregr@netx.as>
Closes: https://lore.kernel.org/intel-wired-lan/1be45a76-90af-4813-824f-8398b69745a9@netx.as/T/#u
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The current implementation of pmbus_show_boolean assumes that all devices
support write-back operation of status register to clear pending warnings
or faults. Since clearing individual bits in the status registers was only
introduced in PMBus specification 1.2, this operation may not be supported
by some older devices. This can result in an error while reading boolean
attributes such as temp1_max_alarm.
Fetch PMBus revision supported by the device and modify pmbus_show_boolean
so that it only tries to clear individual status bits if the device is
compliant with PMBus specs >= 1.2. Otherwise clear all fault indicators
on the current page after a fault status was reported.
Fixes: 35f165f08950a ("hwmon: (pmbus) Clear pmbus fault/warning bits after read")
Signed-off-by: Patryk Biel <pbiel7@gmail.com>
Message-ID: <20240909-pmbus-status-reg-clearing-v1-1-f1c0d68c6408@gmail.com>
[groeck:
Rewrote description
Moved revision detection code ahead of clear faults command
Assigned revision if return value from PMBUS_REVISION command is 0
Improved return value check from calling _pmbus_write_byte_data()]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
|
|
Some DSDT-s have an off-by-one bug where the SINF package count is
one higher than the SQTY reported value, allocate 1 entry extra.
Also make the SQTY <-> SINF package count mismatch error more verbose
to help debugging similar issues in the future.
This fixes the panasonic-laptop driver failing to probe() on some
devices with the following errors:
[ 3.958887] SQTY reports bad SINF length SQTY: 37 SINF-pkg-count: 38
[ 3.958892] Couldn't retrieve BIOS data
[ 3.983685] Panasonic Laptop Support - With Macros: probe of MAT0019:00 failed with error -5
Fixes: 709ee531c153 ("panasonic-laptop: add Panasonic Let's Note laptop extras driver v0.94")
Cc: stable@vger.kernel.org
Tested-by: James Harmison <jharmison@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20240909113227.254470-2-hdegoede@redhat.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
The panasonic laptop code in various places uses the SINF array with index
values of 0 - SINF_CUR_BRIGHT(0x0d) without checking that the SINF array
is big enough.
Not all panasonic laptops have this many SINF array entries, for example
the Toughbook CF-18 model only has 10 SINF array entries. So it only
supports the AC+DC brightness entries and mute.
Check that the SINF array has a minimum size which covers all AC+DC
brightness entries and refuse to load if the SINF array is smaller.
For higher SINF indexes hide the sysfs attributes when the SINF array
does not contain an entry for that attribute, avoiding show()/store()
accessing the array out of bounds and add bounds checking to the probe()
and resume() code accessing these.
Fixes: e424fb8cc4e6 ("panasonic-laptop: avoid overflow in acpi_pcc_hotkey_add()")
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20240909113227.254470-1-hdegoede@redhat.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ath/ath
ath.git patches for v6.12
This is once again a fairly light pull request since ath12k is still
working on MLO-related changes, and the other drivers are mostly in
maintenance mode.
ath12k
* Fix a frame-larger-than warning seen with debug builds
* Fix flex-array-member-not-at-end warnings
ath11k
* Fix flex-array-member-not-at-end warnings
ath9k
* Fix a syzbot-reported issue on USB-based devices
|
|
mt76 patches for 6.12
- fixes
- mt7915 .sta_state support
- mt7915 hardware restart improvements
|
|
Pull bcachefs fixes from Kent Overstreet:
- fix ca->io_ref usage; analagous to previous patch doing that for main
discard path
- cond_resched() in __journal_keys_sort(), cutting down on "hung task"
warnings when journal is big
- rest of basic BCH_SB_MEMBER_INVALID support
- and the critical one: don't delete open files in online fsck, this
was causing the "dirent points to inode that doesn't point back"
inconsistencies some users were seeing
* tag 'bcachefs-2024-09-09' of git://evilpiepirate.org/bcachefs:
bcachefs: Don't delete open files in online fsck
bcachefs: fix btree_key_cache sysfs knob
bcachefs: More BCH_SB_MEMBER_INVALID support
bcachefs: Simplify bch2_bkey_drop_ptrs()
bcachefs: Add a cond_resched() to __journal_keys_sort()
bcachefs: Fix ca->io_ref usage
|
|
If we get a failure at the first decompressor init (i = 0),
the clean up while loop could enter infinite loop due to wrong while
check. Check the value of i now to see if we need any clean up at all.
Fixes: 5a7cce827ee9 ("erofs: refine z_erofs_{init,exit}_subsystem()")
Reported-by: liujinbao1 <liujinbao1@xiaomi.com>
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20240905060027.2388893-1-dhavale@google.com
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
After commit 684b290abc77 ("erofs: add support for
FS_IOC_GETFSSYSFSPATH"), `sb->s_sysfs_name` is now valid.
Just use it to get rid of duplicated logic.
Reviewed-by: Sandeep Dhavale <dhavale@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240828095232.571946-1-hsiangkao@linux.alibaba.com
|
|
Fast symlink can be used if the on-disk symlink data is stored
in the same block as the on-disk inode, so we don’t need to trigger
another I/O for symlink data. However, currently fs correction could be
reported _incorrectly_ if inode xattrs are too large.
In fact, these should be valid images although they cannot be handled as
fast symlinks.
Many thanks to Colin for reporting this!
Reported-by: Colin Walters <walters@verbum.org>
Reported-by: https://honggfuzz.dev/
Link: https://lore.kernel.org/r/bb2dd430-7de0-47da-ae5b-82ab2dd4d945@app.fastmail.com
Fixes: 431339ba9042 ("staging: erofs: add inode operations")
[ Note that it's a runtime misbehavior instead of a security issue. ]
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240909031911.1174718-1-hsiangkao@linux.alibaba.com
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux
Merge devfreq updates for v6.12 from Chanwoo Choi:
"Detailed description for this pull request:
- Add missing MODULE_DESCRIPTION() macros for devfreq governors.
- Use Use devm_clk_get_enabled() helpers for exyns-bus devfreq driver.
- Use of_property_present() instead of of_get_property() for imx-bus
devfreq driver."
* tag 'devfreq-next-for-6.12' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
PM / devfreq: imx-bus: Use of_property_present()
PM / devfreq: exynos: Use Use devm_clk_get_enabled() helpers
PM/devfreq: governor: add missing MODULE_DESCRIPTION() macros
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
- Add a documentation overview of Confidential Computing VM support
(Michael Kelley)
- Use lapic timer in a TDX VM without paravisor (Dexuan Cui)
- Set X86_FEATURE_TSC_KNOWN_FREQ when Hyper-V provides frequency
(Michael Kelley)
- Fix a kexec crash due to VP assist page corruption (Anirudh
Rayabharam)
- Python3 compatibility fix for lsvmbus (Anthony Nandaa)
- Misc fixes (Rachel Menge, Roman Kisel, zhang jiao, Hongbo Li)
* tag 'hyperv-fixes-signed-20240908' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
hv: vmbus: Constify struct kobj_type and struct attribute_group
tools: hv: rm .*.cmd when make clean
x86/hyperv: fix kexec crash due to VP assist page corruption
Drivers: hv: vmbus: Fix the misplaced function description
tools: hv: lsvmbus: change shebang to use python3
x86/hyperv: Set X86_FEATURE_TSC_KNOWN_FREQ when Hyper-V provides frequency
Documentation: hyperv: Add overview of Confidential Computing VM support
clocksource: hyper-v: Use lapic timer in a TDX VM without paravisor
Drivers: hv: Remove deprecated hv_fcopy declarations
|
|
Highlight that the file_set_fowner hook is now called with a lock held.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: James Morris <jmorris@namei.org>
Cc: Jann Horn <jannh@google.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
The fcntl's F_SETOWN command sets the process that handle SIGIO/SIGURG
for the related file descriptor. Before this change, the
file_set_fowner LSM hook was always called, ignoring the VFS logic which
may not actually change the process that handles SIGIO (e.g. TUN, TTY,
dnotify), nor update the related UID/EUID.
Moreover, because security_file_set_fowner() was called without lock
(e.g. f_owner.lock), concurrent F_SETOWN commands could result to a race
condition and inconsistent LSM states (e.g. SELinux's fown_sid) compared
to struct fown_struct's UID/EUID.
This change makes sure the LSM states are always in sync with the VFS
state by moving the security_file_set_fowner() call close to the
UID/EUID updates and using the same f_owner.lock .
Rename f_modown() to __f_setown() to simplify code.
Cc: stable@vger.kernel.org
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: James Morris <jmorris@namei.org>
Cc: Jann Horn <jannh@google.com>
Cc: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Serge E. Hallyn <serge@hallyn.com>
Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Merge thermal core fixes and cleanups for 6.12:
- Refuse to accept trip point temperature or hysteresis that would lead
to an invalid threshold value when setting them via sysfs (Rafael
Wysocki).
- Adjust states of all uninitialized instances in the .manage()
callback of the Bang-bang thermal governor (Rafael Wysocki).
- Drop a couple of redundant checks along with the code depending on
them from the thermal core (Rafael Wysocki).
- Rearrange the thermal core to avoid redundant checks and simplify
control flow in a couple of code paths (Rafael Wysocki).
* thermal-core:
thermal: core: Drop thermal_zone_device_is_enabled()
thermal: core: Check passive delay in monitor_thermal_zone()
thermal: core: Drop dead code from monitor_thermal_zone()
thermal: core: Drop redundant lockdep_assert_held()
thermal: gov_bang_bang: Adjust states of all uninitialized instances
thermal: sysfs: Add sanity checks for trip temperature and hysteresis
|
|
When building serial_base as a module, modpost fails with the following
error message:
ERROR: modpost: "match_devname_and_update_preferred_console"
[drivers/tty/serial/serial_base.ko] undefined!
Export the symbol to allow using it from modules.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202409071312.qlwtTOS1-lkp@intel.com/
Fixes: 12c91cec3155 ("serial: core: Add serial_base_match_and_update_preferred_console()")
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Link: https://lore.kernel.org/r/20240909075652.747370-1-liaoyu15@huawei.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
The submit queue polling threads are userland threads that just never
exit to the userland. When creating the thread with IORING_SETUP_SQ_AFF,
the affinity of the poller thread is set to the cpu specified in
sq_thread_cpu. However, this CPU can be outside of the cpuset defined
by the cgroup cpuset controller. This violates the rules defined by the
cpuset controller and is a potential issue for realtime applications.
In b7ed6d8ffd6 we fixed the default affinity of the poller thread, in
case no explicit pinning is required by inheriting the one of the
creating task. In case of explicit pinning, the check is more
complicated, as also a cpu outside of the parent cpumask is allowed.
We implemented this by using cpuset_cpus_allowed (that has support for
cgroup cpusets) and testing if the requested cpu is in the set.
Fixes: 37d1e2e3642e ("io_uring: move SQPOLL thread io-wq forked worker")
Cc: stable@vger.kernel.org # 6.1+
Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
Link: https://lore.kernel.org/r/20240909150036.55921-1-felix.moessbauer@siemens.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
fill_pool() checks locklessly at the beginning whether the pool has to be
refilled. After that it checks locklessly in a loop whether the free list
contains objects and repeats the refill check.
If both conditions are true, it acquires the pool lock and tries to move
objects from the free list to the pool repeating the same checks again.
There are two redundant issues with that:
1) The repeated check for the fill condition
2) The loop processing
The repeated check is pointless as it was just established that fill is
required. The condition has to be re-evaluated under the lock anyway.
The loop processing is not required either because there is practically
zero chance that a repeated attempt will succeed if the checks under the
lock terminate the moving of objects.
Remove the redundant check and replace the loop with a simple if condition.
[ tglx: Massaged change log ]
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240904133944.2124-4-thunder.leizhen@huawei.com
|
|
fill_pool() uses 'obj_pool_min_free' to decide whether objects should be
handed back to the kmem cache. But 'obj_pool_min_free' records the lowest
historical value of the number of objects in the object pool and not the
minimum number of objects which should be kept in the pool.
Use 'debug_objects_pool_min_level' instead, which holds the minimum number
which was scaled to the number of CPUs at boot time.
[ tglx: Massage change log ]
Fixes: d26bf5056fc0 ("debugobjects: Reduce number of pool_lock acquisitions in fill_pool()")
Fixes: 36c4ead6f6df ("debugobjects: Add global free list and the counter")
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240904133944.2124-3-thunder.leizhen@huawei.com
|
|
1. Both debug_objects_pool_min_level and debug_objects_pool_size are
read-only after initialization, change attribute '__read_mostly' to
'__ro_after_init', and remove '__data_racy'.
2. Many global variables are read in the debug_stats_show() function, but
didn't mask KCSAN's detection. Add '__data_racy' for them.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240904133944.2124-2-thunder.leizhen@huawei.com
|