Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2022-05-02
1) Trivial Misc updates to mlx5 driver
2) From Mark Bloch: Flow steering, general steering refactoring/cleaning
An issue with flow steering deletion flow (when creating a rule without
dests) turned out to be easy to fix but during the fix some issue
with the flow steering creation/deletion flows have been found.
The following patch series tries to fix long standing issues with flow
steering code and hopefully preventing silly future bugs.
A) Fix an issue where a proper dest type wasn't assigned.
B) Refactor and fix dests enums values, refactor deletion
function and do proper bookkeeping of dests.
C) Change mlx5_del_flow_rules() to delete rules when there are no
no more rules attached associated with an FTE.
D) Don't call hard coded deletion function but use the node's
defined one.
E) Add a WARN_ON() to catch future bugs when an FTE with dests
is deleted.
* tag 'mlx5-updates-2022-05-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: fs, an FTE should have no dests when deleted
net/mlx5: fs, call the deletion function of the node
net/mlx5: fs, delete the FTE when there are no rules attached to it
net/mlx5: fs, do proper bookkeeping for forward destinations
net/mlx5: fs, add unused destination type
net/mlx5: fs, jump to exit point and don't fall through
net/mlx5: fs, refactor software deletion rule
net/mlx5: fs, split software and IFC flow destination definitions
net/mlx5e: TC, set proper dest type
net/mlx5e: Remove unused mlx5e_dcbnl_build_rep_netdev function
net/mlx5e: Drop error CQE handling from the XSK RX handler
net/mlx5: Print initializing field in case of timeout
net/mlx5: Delete redundant default assignment of runtime devlink params
net/mlx5: Remove useless kfree
net/mlx5: use kvfree() for kvzalloc() in mlx5_ct_fs_smfs_matcher_create
====================
Link: https://lore.kernel.org/r/
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Spectrum machines have two resources related to keeping packets in an
internal buffer: bytes (allocated in cell-sized units) for packet payload,
and descriptors, for keeping metadata. Currently, mlxsw only configures the
bytes part of the resource management.
Spectrum switches permit a full parallel configuration for the descriptor
resources, including port-pool and port-TC-pool quotas. By default, these
are all configured to use pool 14, with an infinite quota. The ingress pool
14 is then infinite in size.
However, egress pool 14 has finite size by default. The size is chip
dependent, but always much lower than what the chip actually permits. As a
result, we can easily construct workloads that exhaust the configured
descriptor limit.
Fix the issue by configuring the egress descriptor pool to be infinite in
size as well. This will maintain the configuration philosophy of the
default configuration, but will unlock all chip resources to be usable.
In the code, include both the configuration of ingress and ingress, mostly
for clarity.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
SBPR, or Shared Buffer Pools Register, configures and retrieves the shared
buffer pools and configuration. The desc field determines whether the
configuration relates to the byte pool or the descriptor pool.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When deleting an FTE it should have no dests, which means
fte->dests_size should be 0. Add a WARN_ON() to catch bugs
where the proper tracking wasn't done.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Don't call del_hw_fte() directly, instead use the hardware deletion
function set. This is just a small cleanup and doesn't change anything
as for an FTE the deletion function is already set to del_hw_fte().
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
When an FTE has no children is means all the rules where removed
and the FTE can be deleted regardless of the dests_size value.
While dests_size should be 0 when there are no children
be extra careful not to leak memory or get firmware syndrome
if the proper bookkeeping of dests_size wasn't done.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Keep track after destinations that are forward destinations.
When a forward destinations is removed from an FTE check if
the actions bits need to be updated.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
When the caller doesn't pass a destination fs_core will create a unused
rule just so a context can be returned. This unused rule
is zeroed out and its type is 0 which can be mixed up with
MLX5_FLOW_DESTINATION_TYPE_VPORT.
Create a dedicated type to differentiate between the two
named MLX5_FLOW_DESTINATION_TYPE_NONE.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
For code clarity and to prevent future bugs make sure to jump
to the exit point once done handling that specific type.
This aligns the code with the rest logic in the function.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
When deleting a rule make sure that for every type dests_size is
decreased only once and no other logic is executed.
Without this dests_size might be decreased twice when dests_size == 1
so the if for that type won't be entered and if action has
MLX5_FLOW_CONTEXT_ACTION_FWD_DEST set dests_size will be decreased again.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Separate flow destinations between software and IFC.
Flow destination type passed by callers was used as the input in
firmware commands and over the years software only types were added
which resulted in mixing between the two.
Create an IFC enum that contains only the flow destinations defined
when talking to the firmware.
Now that there is a proper software only enum for flow destinations
the hardcoded values can be removed as the values are no longer used
in firmware commands.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Dest type isn't set, this works only because
MLX5_FLOW_DESTINATION_TYPE_VPORT is zero. Set the proper type.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Commit
7a9fb35e8c3a ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode")
removed the usage of mlx5e_dcbnl_build_rep_netdev() from the driver,
delete the function.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
This commit removes the redundant check and removes the unused cqe parameter
of skb_from_cqe handlers.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Print the initializing field in case of FW couldn't initialize before
timeout. This will help to better understand the root cause in some
cases.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Runtime devlink params always read their values from the get() callbacks.
Also, it is an error to set driverinit_value for params which don't
support driverinit cmode. Delete such assignments.
In addition, move the set of default matching mode inside eswitch code.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
After alloc fail, we do not need to kfree.
Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
The memory of spec is allocated with kvzalloc(), the corresponding
release function should not be kfree(), use kvfree() instead.
Generated by: scripts/coccinelle/api/kfree_mismatch.cocci
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Current memory failure code in the debugfs returns -ENOSPC. This is
normally used for indicating that there is no space left on the
device and is not applicable for memory allocation failures.
Replace this with -ENOMEM.
Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
Link: https://lore.kernel.org/r/20220430194656.44357-1-dossche.niels@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
VxLAN belongs to UDP-based encapsulation protocol. Inner TSO for VxLAN
packet with udpcsum requires offloading of outer header csum.
The device doesn't support outer header csum offload. However, inner TSO
for VxLAN with udpcsum can still work with GSO_PARTIAL offload, which
means outer udp csum computed by stack and inner tcp segmentation finished
by hardware. Thus, the patch enable features "NETIF_F_GSO_UDP_TUNNEL_CSUM"
and "NETIF_F_GSO_PARTIAL" and set gso_partial_features.
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220430231150.175270-1-simon.horman@corigine.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.
This patch covers three more drivers which I missed in
commit 5f012b40ef63 ("eth: remove copies of the NAPI_POLL_WEIGHT define").
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Simplify the return expression.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Reduce verbosity of ptp tx timestamp error to reduce excessive log
messages.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a desire to share the oclot_stats_layout struct outside of the
current vsc7514 driver. In order to do so, the length of the array needs to
be known at compile time, and defined in the struct ocelot and struct
felix_info.
Since the array is defined in a .c file and would be declared in the header
file via:
extern struct ocelot_stat_layout[];
the size of the array will not be known at compile time to outside modules.
To fix this, remove the need for defining the number of stats at compile
time and allow this number to be determined at initialization.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The PHY subsystem as well as the MIIM mdio driver (in case of the
integrated PHYs) will take care of the resets. A separate reset driver
isn't needed. There is no in-tree user of this feature. Remove the
support.
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The device info from which conntrack originates is stored in metadata
field of the ct flow to offload now, driver can utilize it to reduce
the number of offloaded flows.
v2: Drop inline keyword from get_netdev_from_rule() signature.
The compiler can decide.
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220429075124.128589-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch extends the EF100 PF driver by adding .sriov_configure()
which would allow users to enable and disable virtual functions
using the sriov sysfs.
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/75e74d9e-14ce-0524-9668-5ab735a7cf62@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.
Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.
Drop the special defines in a bunch of drivers where the
removal is relatively simple so grouping into one patch
does not impact reviewability.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
- Add HW api to configure policer:
- SR TCM policer mode is only supported for now.
- Policer ingress/egress direction support.
- Add police action support into flower
Signed-off-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu>
Link: https://lore.kernel.org/r/1651061148-21321-1-git-send-email-volodymyr.mytnyk@plvision.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
include/linux/netdevice.h
net/core/dev.c
6510ea973d8d ("net: Use this_cpu_inc() to increment net->core_stats")
794c24e9921f ("net-core: rx_otherhost_dropped to core_stats")
https://lore.kernel.org/all/20220428111903.5f4304e0@canb.auug.org.au/
drivers/net/wan/cosa.c
d48fea8401cf ("net: cosa: fix error check return value of register_chrdev()")
89fbca3307d4 ("net: wan: remove support for COSA and SRP synchronous serial boards")
https://lore.kernel.org/all/20220428112130.1f689e5e@canb.auug.org.au/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit 723ad916134784b317b72f3f6cf0f7ba774e5dae
When client requests channel or ring size larger than what the server
can support the server will cap the request to the supported max. So,
the client would not be able to successfully request resources that
exceed the server limit.
Fixes: 723ad9161347 ("ibmvnic: Add ethtool private flag for driver-defined queue limits")
Signed-off-by: Dany Madden <drt@linux.ibm.com>
Link: https://lore.kernel.org/r/20220427235146.23189-1-drt@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The Time-Specified Departure feature is indeed mutually exclusive with
TX IP checksumming in ENETC, but TX checksumming in itself is broken and
was removed from this driver in commit 82728b91f124 ("enetc: Remove Tx
checksumming offload code").
The blamed commit declared NETIF_F_HW_CSUM in dev->features to comply
with software TSO's expectations, and still did the checksumming in
software by calling skb_checksum_help(). So there isn't any restriction
for the Time-Specified Departure feature.
However, enetc_setup_tc_txtime() doesn't understand that, and blindly
looks for NETIF_F_CSUM_MASK.
Instead of checking for things which can literally never happen in the
current code base, just remove the check and let the driver offload
tc-etf qdiscs.
Fixes: acede3c5dad5 ("net: enetc: declare NETIF_F_HW_CSUM and do it in software")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220427203017.1291634-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The VF driver can forward any IPsec flags and such makes the function
is not extendable and prone to backward/forward incompatibility.
If new software runs on VF, it won't know that PF configured something
completely different as it "knows" only XFRM_OFFLOAD_INBOUND flag.
Fixes: eda0333ac293 ("ixgbe: add VF IPsec management")
Reviewed-by: Raed Salem <raeds@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220427173152.443102-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Put device node in error path in fec_enet_init_stop_mode().
Fixes: 8a448bf832af ("net: ethernet: fec: move GPR register offset and bit into DT")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220426125231.375688-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
While handling PCI errors (AER flow) driver tries to
disable NAPI [napi_disable()] after NAPI is deleted
[__netif_napi_del()] which causes unexpected system
hang/crash.
System message log shows the following:
=======================================
[ 3222.537510] EEH: Detected PCI bus error on PHB#384-PE#800000 [ 3222.537511] EEH: This PCI device has failed 2 times in the last hour and will be permanently disabled after 5 failures.
[ 3222.537512] EEH: Notify device drivers to shutdown [ 3222.537513] EEH: Beginning: 'error_detected(IO frozen)'
[ 3222.537514] EEH: PE#800000 (PCI 0384:80:00.0): Invoking
bnx2x->error_detected(IO frozen)
[ 3222.537516] bnx2x: [bnx2x_io_error_detected:14236(eth14)]IO error detected [ 3222.537650] EEH: PE#800000 (PCI 0384:80:00.0): bnx2x driver reports:
'need reset'
[ 3222.537651] EEH: PE#800000 (PCI 0384:80:00.1): Invoking
bnx2x->error_detected(IO frozen)
[ 3222.537651] bnx2x: [bnx2x_io_error_detected:14236(eth13)]IO error detected [ 3222.537729] EEH: PE#800000 (PCI 0384:80:00.1): bnx2x driver reports:
'need reset'
[ 3222.537729] EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'need reset'
[ 3222.537890] EEH: Collect temporary log [ 3222.583481] EEH: of node=0384:80:00.0 [ 3222.583519] EEH: PCI device/vendor: 168e14e4 [ 3222.583557] EEH: PCI cmd/status register: 00100140 [ 3222.583557] EEH: PCI-E capabilities and status follow:
[ 3222.583744] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.583892] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.583893] EEH: PCI-E 20: 00000000 [ 3222.583893] EEH: PCI-E AER capability register set follows:
[ 3222.584079] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.584230] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.584378] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.584416] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.584416] EEH: of node=0384:80:00.1 [ 3222.584454] EEH: PCI device/vendor: 168e14e4 [ 3222.584491] EEH: PCI cmd/status register: 00100140 [ 3222.584492] EEH: PCI-E capabilities and status follow:
[ 3222.584677] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.584825] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.584826] EEH: PCI-E 20: 00000000 [ 3222.584826] EEH: PCI-E AER capability register set follows:
[ 3222.585011] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.585160] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.585309] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.585347] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.586872] RTAS: event: 5, Type: Platform Error (224), Severity: 2 [ 3222.586873] EEH: Reset without hotplug activity [ 3224.762767] EEH: Beginning: 'slot_reset'
[ 3224.762770] EEH: PE#800000 (PCI 0384:80:00.0): Invoking
bnx2x->slot_reset()
[ 3224.762771] bnx2x: [bnx2x_io_slot_reset:14271(eth14)]IO slot reset initializing...
[ 3224.762887] bnx2x 0384:80:00.0: enabling device (0140 -> 0142) [ 3224.768157] bnx2x: [bnx2x_io_slot_reset:14287(eth14)]IO slot reset
--> driver unload
Uninterruptible tasks
=====================
crash> ps | grep UN
213 2 11 c000000004c89e00 UN 0.0 0 0 [eehd]
215 2 0 c000000004c80000 UN 0.0 0 0
[kworker/0:2]
2196 1 28 c000000004504f00 UN 0.1 15936 11136 wickedd
4287 1 9 c00000020d076800 UN 0.0 4032 3008 agetty
4289 1 20 c00000020d056680 UN 0.0 7232 3840 agetty
32423 2 26 c00000020038c580 UN 0.0 0 0
[kworker/26:3]
32871 4241 27 c0000002609ddd00 UN 0.1 18624 11648 sshd
32920 10130 16 c00000027284a100 UN 0.1 48512 12608 sendmail
33092 32987 0 c000000205218b00 UN 0.1 48512 12608 sendmail
33154 4567 16 c000000260e51780 UN 0.1 48832 12864 pickup
33209 4241 36 c000000270cb6500 UN 0.1 18624 11712 sshd
33473 33283 0 c000000205211480 UN 0.1 48512 12672 sendmail
33531 4241 37 c00000023c902780 UN 0.1 18624 11648 sshd
EEH handler hung while bnx2x sleeping and holding RTNL lock
===========================================================
crash> bt 213
PID: 213 TASK: c000000004c89e00 CPU: 11 COMMAND: "eehd"
#0 [c000000004d477e0] __schedule at c000000000c70808
#1 [c000000004d478b0] schedule at c000000000c70ee0
#2 [c000000004d478e0] schedule_timeout at c000000000c76dec
#3 [c000000004d479c0] msleep at c0000000002120cc
#4 [c000000004d479f0] napi_disable at c000000000a06448
^^^^^^^^^^^^^^^^
#5 [c000000004d47a30] bnx2x_netif_stop at c0080000018dba94 [bnx2x]
#6 [c000000004d47a60] bnx2x_io_slot_reset at c0080000018a551c [bnx2x]
#7 [c000000004d47b20] eeh_report_reset at c00000000004c9bc
#8 [c000000004d47b90] eeh_pe_report at c00000000004d1a8
#9 [c000000004d47c40] eeh_handle_normal_event at c00000000004da64
And the sleeping source code
============================
crash> dis -ls c000000000a06448
FILE: ../net/core/dev.c
LINE: 6702
6697 {
6698 might_sleep();
6699 set_bit(NAPI_STATE_DISABLE, &n->state);
6700
6701 while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
* 6702 msleep(1);
6703 while (test_and_set_bit(NAPI_STATE_NPSVC, &n->state))
6704 msleep(1);
6705
6706 hrtimer_cancel(&n->timer);
6707
6708 clear_bit(NAPI_STATE_DISABLE, &n->state);
6709 }
EEH calls into bnx2x twice based on the system log above, first through
bnx2x_io_error_detected() and then bnx2x_io_slot_reset(), and executes
the following call chains:
bnx2x_io_error_detected()
+-> bnx2x_eeh_nic_unload()
+-> bnx2x_del_all_napi()
+-> __netif_napi_del()
bnx2x_io_slot_reset()
+-> bnx2x_netif_stop()
+-> bnx2x_napi_disable()
+->napi_disable()
Fix this by correcting the sequence of NAPI APIs usage,
that is delete the NAPI after disabling it.
Fixes: 7fa6f34081f1 ("bnx2x: AER revised")
Reported-by: David Christensen <drc@linux.vnet.ibm.com>
Tested-by: David Christensen <drc@linux.vnet.ibm.com>
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Link: https://lore.kernel.org/r/20220426153913.6966-1-manishc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-04-27
We've added 85 non-merge commits during the last 18 day(s) which contain
a total of 163 files changed, 4499 insertions(+), 1521 deletions(-).
The main changes are:
1) Teach libbpf to enhance BPF verifier log with human-readable and relevant
information about failed CO-RE relocations, from Andrii Nakryiko.
2) Add typed pointer support in BPF maps and enable it for unreferenced pointers
(via probe read) and referenced ones that can be passed to in-kernel helpers,
from Kumar Kartikeya Dwivedi.
3) Improve xsk to break NAPI loop when rx queue gets full to allow for forward
progress to consume descriptors, from Maciej Fijalkowski & Björn Töpel.
4) Fix a small RCU read-side race in BPF_PROG_RUN routines which dereferenced
the effective prog array before the rcu_read_lock, from Stanislav Fomichev.
5) Implement BPF atomic operations for RV64 JIT, and add libbpf parsing logic
for USDT arguments under riscv{32,64}, from Pu Lehui.
6) Implement libbpf parsing of USDT arguments under aarch64, from Alan Maguire.
7) Enable bpftool build for musl and remove nftw with FTW_ACTIONRETVAL usage
so it can be shipped under Alpine which is musl-based, from Dominique Martinet.
8) Clean up {sk,task,inode} local storage trace RCU handling as they do not
need to use call_rcu_tasks_trace() barrier, from KP Singh.
9) Improve libbpf API documentation and fix error return handling of various
API functions, from Grant Seltzer.
10) Enlarge offset check for bpf_skb_{load,store}_bytes() helpers given data
length of frags + frag_list may surpass old offset limit, from Liu Jian.
11) Various improvements to prog_tests in area of logging, test execution
and by-name subtest selection, from Mykola Lysenko.
12) Simplify map_btf_id generation for all map types by moving this process
to build time with help of resolve_btfids infra, from Menglong Dong.
13) Fix a libbpf bug in probing when falling back to legacy bpf_probe_read*()
helpers; the probing caused always to use old helpers, from Runqing Yang.
14) Add support for ARCompact and ARCv2 platforms for libbpf's PT_REGS
tracing macros, from Vladimir Isaev.
15) Cleanup BPF selftests to remove old & unneeded rlimit code given kernel
switched to memcg-based memory accouting a while ago, from Yafang Shao.
16) Refactor of BPF sysctl handlers to move them to BPF core, from Yan Zhu.
17) Fix BPF selftests in two occasions to work around regressions caused by latest
LLVM to unblock CI until their fixes are worked out, from Yonghong Song.
18) Misc cleanups all over the place, from various others.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits)
selftests/bpf: Add libbpf's log fixup logic selftests
libbpf: Fix up verifier log for unguarded failed CO-RE relos
libbpf: Simplify bpf_core_parse_spec() signature
libbpf: Refactor CO-RE relo human description formatting routine
libbpf: Record subprog-resolved CO-RE relocations unconditionally
selftests/bpf: Add CO-RE relos and SEC("?...") to linked_funcs selftests
libbpf: Avoid joining .BTF.ext data with BPF programs by section name
libbpf: Fix logic for finding matching program for CO-RE relocation
libbpf: Drop unhelpful "program too large" guess
libbpf: Fix anonymous type check in CO-RE logic
bpf: Compute map_btf_id during build time
selftests/bpf: Add test for strict BTF type check
selftests/bpf: Add verifier tests for kptr
selftests/bpf: Add C tests for kptr
libbpf: Add kptr type tag macros to bpf_helpers.h
bpf: Make BTF type match stricter for release arguments
bpf: Teach verifier about kptr_get kfunc helpers
bpf: Wire up freeing of referenced kptr
bpf: Populate pairs of btf_id and destructor kfunc in btf
bpf: Adapt copy_map_value for multiple offset case
...
====================
Link: https://lore.kernel.org/r/20220427224758.20976-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Extend the PTP programmable pins to implement also PTP_PF_EXTTS
function. The PTP pin can be configured to capture only on the rising
edge of the PPS signal. And once an event is seen then an interrupt is
generated and the local time counter is saved.
The interrupt is shared between all the pins.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Lan966x has 8 PTP programmable pins, where the last pins is hardcoded to
be used by PHC0, which does the frame timestamping. All the rest of the
PTP pins can be shared between the PHCs and can have different functions
like perout or extts. For now add support for PTP_FS_PEROUT.
The HW is not able to support absolute start time but can use the nsec
for phase adjustment when generating PPS.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|