From e9942bfe493108bceb64a91c2a832412524e8b78 Mon Sep 17 00:00:00 2001 From: Marcin Szycik Date: Wed, 9 Oct 2024 17:18:35 +0200 Subject: ice: Fix use after free during unload with ports in bridge Unloading the ice driver while switchdev port representors are added to a bridge can lead to kernel panic. Reproducer: modprobe ice devlink dev eswitch set $PF1_PCI mode switchdev ip link add $BR type bridge ip link set $BR up echo 2 > /sys/class/net/$PF1/device/sriov_numvfs sleep 2 ip link set $PF1 master $BR ip link set $VF1_PR master $BR ip link set $VF2_PR master $BR ip link set $PF1 up ip link set $VF1_PR up ip link set $VF2_PR up ip link set $VF1 up rmmod irdma ice When unloading the driver, ice_eswitch_detach() is eventually called as part of VF freeing. First, it removes a port representor from xarray, then unregister_netdev() is called (via repr->ops.rem()), finally representor is deallocated. The problem comes from the bridge doing its own deinit at the same time. unregister_netdev() triggers a notifier chain, resulting in ice_eswitch_br_port_deinit() being called. It should set repr->br_port = NULL, but this does not happen since repr has already been removed from xarray and is not found. Regardless, it finishes up deallocating br_port. At this point, repr is still not freed and an fdb event can happen, in which ice_eswitch_br_fdb_event_work() takes repr->br_port and tries to use it, which causes a panic (use after free). Note that this only happens with 2 or more port representors added to the bridge, since with only one representor port, the bridge deinit is slightly different (ice_eswitch_br_port_deinit() is called via ice_eswitch_br_ports_flush(), not ice_eswitch_br_port_unlink()). Trace: Oops: general protection fault, probably for non-canonical address 0xf129010fd1a93284: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: maybe wild-memory-access in range [0x8948287e8d499420-0x8948287e8d499427] (...) Workqueue: ice_bridge_wq ice_eswitch_br_fdb_event_work [ice] RIP: 0010:__rht_bucket_nested+0xb4/0x180 (...) Call Trace: (...) ice_eswitch_br_fdb_find+0x3fa/0x550 [ice] ? __pfx_ice_eswitch_br_fdb_find+0x10/0x10 [ice] ice_eswitch_br_fdb_event_work+0x2de/0x1e60 [ice] ? __schedule+0xf60/0x5210 ? mutex_lock+0x91/0xe0 ? __pfx_ice_eswitch_br_fdb_event_work+0x10/0x10 [ice] ? ice_eswitch_br_update_work+0x1f4/0x310 [ice] (...) A workaround is available: brctl setageing $BR 0, which stops the bridge from adding fdb entries altogether. Change the order of operations in ice_eswitch_detach(): move the call to unregister_netdev() before removing repr from xarray. This way repr->br_port will be correctly set to NULL in ice_eswitch_br_port_deinit(), preventing a panic. Fixes: fff292b47ac1 ("ice: add VF representors one by one") Reviewed-by: Michal Swiatkowski Reviewed-by: Paul Menzel Signed-off-by: Marcin Szycik Tested-by: Sujai Buvaneswaran Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/ice/ice_eswitch.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'drivers/net/ethernet/intel/ice') diff --git a/drivers/net/ethernet/intel/ice/ice_eswitch.c b/drivers/net/ethernet/intel/ice/ice_eswitch.c index c0b3e70a7ea3..fb527434b58b 100644 --- a/drivers/net/ethernet/intel/ice/ice_eswitch.c +++ b/drivers/net/ethernet/intel/ice/ice_eswitch.c @@ -552,13 +552,14 @@ int ice_eswitch_attach_sf(struct ice_pf *pf, struct ice_dynamic_port *sf) static void ice_eswitch_detach(struct ice_pf *pf, struct ice_repr *repr) { ice_eswitch_stop_reprs(pf); + repr->ops.rem(repr); + xa_erase(&pf->eswitch.reprs, repr->id); if (xa_empty(&pf->eswitch.reprs)) ice_eswitch_disable_switchdev(pf); ice_eswitch_release_repr(pf, repr); - repr->ops.rem(repr); ice_repr_destroy(repr); if (xa_empty(&pf->eswitch.reprs)) { -- cgit From 64502dac974a5d9951d16015fa2e16a14e5f2bb2 Mon Sep 17 00:00:00 2001 From: Mateusz Polchlopek Date: Mon, 28 Oct 2024 12:59:22 -0400 Subject: ice: change q_index variable type to s16 to store -1 value Fix Flow Director not allowing to re-map traffic to 0th queue when action is configured to drop (and vice versa). The current implementation of ethtool callback in the ice driver forbids change Flow Director action from 0 to -1 and from -1 to 0 with an error, e.g: # ethtool -U eth2 flow-type tcp4 src-ip 1.1.1.1 loc 1 action 0 # ethtool -U eth2 flow-type tcp4 src-ip 1.1.1.1 loc 1 action -1 rmgr: Cannot insert RX class rule: Invalid argument We set the value of `u16 q_index = 0` at the beginning of the function ice_set_fdir_input_set(). In case of "drop traffic" action (which is equal to -1 in ethtool) we store the 0 value. Later, when want to change traffic rule to redirect to queue with index 0 it returns an error caused by duplicate found. Fix this behaviour by change of the type of field `q_index` from u16 to s16 in `struct ice_fdir_fltr`. This allows to store -1 in the field in case of "drop traffic" action. What is more, change the variable type in the function ice_set_fdir_input_set() and assign at the beginning the new `#define ICE_FDIR_NO_QUEUE_IDX` which is -1. Later, if the action is set to another value (point specific queue index) the variable value is overwritten in the function. Fixes: cac2a27cd9ab ("ice: Support IPv4 Flow Director filters") Reviewed-by: Przemek Kitszel Signed-off-by: Mateusz Polchlopek Reviewed-by: Simon Horman Tested-by: Pucha Himasekhar Reddy (A Contingent worker at Intel) Signed-off-by: Tony Nguyen --- drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c | 3 ++- drivers/net/ethernet/intel/ice/ice_fdir.h | 4 +++- 2 files changed, 5 insertions(+), 2 deletions(-) (limited to 'drivers/net/ethernet/intel/ice') diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c b/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c index 5412eff8ef23..ee9862ddfe15 100644 --- a/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c +++ b/drivers/net/ethernet/intel/ice/ice_ethtool_fdir.c @@ -1830,11 +1830,12 @@ static int ice_set_fdir_input_set(struct ice_vsi *vsi, struct ethtool_rx_flow_spec *fsp, struct ice_fdir_fltr *input) { - u16 dest_vsi, q_index = 0; + s16 q_index = ICE_FDIR_NO_QUEUE_IDX; u16 orig_q_index = 0; struct ice_pf *pf; struct ice_hw *hw; int flow_type; + u16 dest_vsi; u8 dest_ctl; if (!vsi || !fsp || !input) diff --git a/drivers/net/ethernet/intel/ice/ice_fdir.h b/drivers/net/ethernet/intel/ice/ice_fdir.h index ab5b118daa2d..820023c0271f 100644 --- a/drivers/net/ethernet/intel/ice/ice_fdir.h +++ b/drivers/net/ethernet/intel/ice/ice_fdir.h @@ -53,6 +53,8 @@ */ #define ICE_FDIR_IPV4_PKT_FLAG_MF 0x20 +#define ICE_FDIR_NO_QUEUE_IDX -1 + enum ice_fltr_prgm_desc_dest { ICE_FLTR_PRGM_DESC_DEST_DROP_PKT, ICE_FLTR_PRGM_DESC_DEST_DIRECT_PKT_QINDEX, @@ -186,7 +188,7 @@ struct ice_fdir_fltr { u16 flex_fltr; /* filter control */ - u16 q_index; + s16 q_index; u16 orig_q_index; u16 dest_vsi; u8 dest_ctl; -- cgit