summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-12-01net: dpaa2-switch replace direct MAC access with dpaa2_switch_port_has_mac()Vladimir Oltean
The helper function will gain a lockdep annotation in a future patch. Make sure to benefit from it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2: publish MAC stringset to ethtool -S even if MAC is missingVladimir Oltean
DPNIs and DPSW objects can connect and disconnect at runtime from DPMAC objects on the same fsl-mc bus. The DPMAC object also holds "ethtool -S" unstructured counters. Those counters are only shown for the entity owning the netdev (DPNI, DPSW) if it's connected to a DPMAC. The ethtool stringset code path is split into multiple callbacks, but currently, connecting and disconnecting the DPMAC takes the rtnl_lock(). This blocks the entire ethtool code path from running, see ethnl_default_doit() -> rtnl_lock() -> ops->prepare_data() -> strset_prepare_data(). This is going to be a problem if we are going to no longer require rtnl_lock() when connecting/disconnecting the DPMAC, because the DPMAC could appear between ops->get_sset_count() and ops->get_strings(). If it appears out of the blue, we will provide a stringset into an array that was dimensioned thinking the DPMAC wouldn't be there => array accessed out of bounds. There isn't really a good way to work around that, and I don't want to put too much pressure on the ethtool framework by playing locking games. Just make the DPMAC counters be always available. They'll be zeroes if the DPNI or DPSW isn't connected to a DPMAC. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2-switch: assign port_priv->mac after dpaa2_mac_connect() callVladimir Oltean
The dpaa2-switch has the exact same locking requirements when connected to a DPMAC, so it needs port_priv->mac to always point either to NULL, or to a DPMAC with a fully initialized phylink instance. Make the same preparatory change in the dpaa2-switch driver as in the dpaa2-eth one. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2-eth: assign priv->mac after dpaa2_mac_connect() callVladimir Oltean
There are 2 requirements for correct code: - Any time the driver accesses the priv->mac pointer at runtime, it either holds NULL to indicate a DPNI-DPNI connection (or unconnected DPNI), or a struct dpaa2_mac whose phylink instance was fully initialized (created and connected to the PHY). No changes are made to priv->mac while it is being used. Currently, rtnl_lock() watches over the call to dpaa2_eth_connect_mac(), so it serves the purpose of serializing this with all readers of priv->mac. - dpaa2_mac_connect() should run unlocked, because inside it are 2 phylink calls with incompatible locking requirements: phylink_create() requires that the rtnl_mutex isn't held, and phylink_fwnode_phy_connect() requires that the rtnl_mutex is held. The only way to solve those contradictory requirements is to let dpaa2_mac_connect() take rtnl_lock() when it needs to. To solve both requirements, we need to identify the writer side of the priv->mac pointer, which can be wrapped in a mutex private to the driver in a future patch. The dpaa2_mac_connect() cannot be part of the writer side critical section, because of an AB/BA deadlock with rtnl_lock(). So the strategy needs to be that where we prepare the DPMAC by calling dpaa2_mac_connect(), and only make priv->mac point to it once it's fully prepared. This ensures that the writer side critical section has the absolute minimum surface it can. The reverse strategy is adopted in the dpaa2_eth_disconnect_mac() code path. This makes sure that priv->mac is NULL when we start tearing down the DPMAC that we disconnected from, and concurrent code will simply not see it. No locking changes in this patch (concurrent code is still blocked by the rtnl_mutex). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2-mac: remove defensive check in dpaa2_mac_disconnect()Vladimir Oltean
dpaa2_mac_disconnect() will only be called with a NULL mac->phylink if dpaa2_mac_connect() failed, or was never called. The callers are these: dpaa2_eth_disconnect_mac(): if (dpaa2_eth_is_type_phy(priv)) dpaa2_mac_disconnect(priv->mac); dpaa2_switch_port_disconnect_mac(): if (dpaa2_switch_port_is_type_phy(port_priv)) dpaa2_mac_disconnect(port_priv->mac); priv->mac can be NULL, but in that case, dpaa2_eth_is_type_phy() returns false, and dpaa2_mac_disconnect() is never called. Similar for dpaa2-switch. When priv->mac is non-NULL, it means that dpaa2_mac_connect() returned zero (success), and therefore, priv->mac->phylink is also a valid pointer. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2-mac: absorb phylink_start() call into dpaa2_mac_start()Vladimir Oltean
The phylink handling is intended to be hidden inside the dpaa2_mac object. Move the phylink_start() call into dpaa2_mac_start(), and phylink_stop() into dpaa2_mac_stop(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2: replace dpaa2_mac_is_type_fixed() with dpaa2_mac_is_type_phy()Vladimir Oltean
dpaa2_mac_is_type_fixed() is a header with no implementation and no callers, which is referenced from the documentation though. It can be deleted. On the other hand, it would be useful to reuse the code between dpaa2_eth_is_type_phy() and dpaa2_switch_port_is_type_phy(). That common code should be called dpaa2_mac_is_type_phy(), so let's create that. The removal and the addition are merged into the same patch because, in fact, is_type_phy() is the logical opposite of is_type_fixed(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: dpaa2-eth: don't use -ENOTSUPP error codeVladimir Oltean
dpaa2_eth_setup_dpni() is called from the probe path and dpaa2_eth_set_link_ksettings() is propagated to user space. include/linux/errno.h says that ENOTSUPP is "Defined for the NFSv3 protocol". Conventional wisdom has it to not use it in networking drivers. Replace it with -EOPNOTSUPP. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01inet: ping: use hlist_nulls rcu iterator during lookupFlorian Westphal
ping_lookup() does not acquire the table spinlock, so iteration should use hlist_nulls_for_each_entry_rcu(). Spotted during code review. Fixes: dbca1596bbb0 ("ping: convert to RCU lookups, get rid of rwlock") Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/20221129140644.28525-1-fw@strlen.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01net: microchip: sparx5: Fix error handling in vcap_show_admin()Dan Carpenter
If vcap_dup_rule() fails that leads to an error pointer dereference side the call to vcap_free_rule(). Also it only returns an error if the very last call to vcap_read_rule() fails and it returns success for other errors. I've changed it to just stop printing after the first error and return an error code. Fixes: 3a7921560d2f ("net: microchip: sparx5: Add VCAP rule debugFS support for the VCAP API") Signed-off-by: Dan Carpenter <error27@gmail.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Link: https://lore.kernel.org/r/Y4XUUx9kzurBN+BV@kili Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01Revert "clocksource/drivers/riscv: Events are stopped during CPU suspend"Conor Dooley
This reverts commit 232ccac1bd9b5bfe73895f527c08623e7fa0752d. On the subject of suspend, the RISC-V SBI spec states: This does not cover whether any given events actually reach the hart or not, just what the hart will do if it receives an event. On PolarFire SoC, and potentially other SiFive based implementations, events from the RISC-V timer do reach a hart during suspend. This is not the case for the implementation on the Allwinner D1 - there timer events are not received during suspend. To fix this, the CLOCK_EVT_FEAT_C3STOP (mis)feature was enabled for the timer driver - but this has broken both RCU stall detection and timers generally on PolarFire SoC and potentially other SiFive based implementations. If an AXI read to the PCIe controller on PolarFire SoC times out, the system will stall, however, with CLOCK_EVT_FEAT_C3STOP active, the system just locks up without RCU stalling: io scheduler mq-deadline registered io scheduler kyber registered microchip-pcie 2000000000.pcie: host bridge /soc/pcie@2000000000 ranges: microchip-pcie 2000000000.pcie: MEM 0x2008000000..0x2087ffffff -> 0x0008000000 microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer microchip-pcie 2000000000.pcie: axi read request error microchip-pcie 2000000000.pcie: axi read timeout microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer Freeing initrd memory: 7332K Similarly issues were reported with clock_nanosleep() - with a test app that sleeps each cpu for 6, 5, 4, 3 ms respectively, HZ=250 & the blamed commit in place, the sleep times are rounded up to the next jiffy: == CPU: 1 == == CPU: 2 == == CPU: 3 == == CPU: 4 == Mean: 7.974992 Mean: 7.976534 Mean: 7.962591 Mean: 3.952179 Std Dev: 0.154374 Std Dev: 0.156082 Std Dev: 0.171018 Std Dev: 0.076193 Hi: 9.472000 Hi: 10.495000 Hi: 8.864000 Hi: 4.736000 Lo: 6.087000 Lo: 6.380000 Lo: 4.872000 Lo: 3.403000 Samples: 521 Samples: 521 Samples: 521 Samples: 521 Fortunately, the D1 has a second timer, which is "currently used in preference to the RISC-V/SBI timer driver" so a revert here does not hurt operation of D1 in its current form. Ultimately, a DeviceTree property (or node) will be added to encode the behaviour of the timers, but until then revert the addition of CLOCK_EVT_FEAT_C3STOP. Fixes: 232ccac1bd9b ("clocksource/drivers/riscv: Events are stopped during CPU suspend") Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Acked-by: Samuel Holland <samuel@sholland.org> Link: https://lore.kernel.org/linux-riscv/YzYTNQRxLr7Q9JR0@spud/ Link: https://github.com/riscv-non-isa/riscv-sbi-doc/issues/98/ Link: https://lore.kernel.org/linux-riscv/bf6d3b1f-f703-4a25-833e-972a44a04114@sholland.org/ Link: https://lore.kernel.org/r/20221122121620.3522431-1-conor.dooley@microchip.com
2022-12-01wifi: rtw89: link rtw89_vif and chanctx stuffsZong-Zhe Yang
First, introduce struct rtw89_sub_entity for chanctx related stuffs. Second, add enum rtw89_sub_entity_idx to rtw89_vif for vif operation to access its/right chanctx stuffs after future multi-channel support. Besides, RTW89_SUB_ENTITY_0 is the default chanctx entry throughout driver, i.e. it's used for things which may not have a target chanctx yet. So, we need to ensure that RTW89_SUB_ENTITY_0 is always working. If there is at least one alive chanctx, then one of them must take RTW89_SUB_ENTITY_0. If no alive chanctx, RTW89_SUB_ENTITY_0 will be filled by rtw89_config_default_chandef(). Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221129083130.45708-7-pkshih@realtek.com
2022-12-01wifi: rtw89: fw: implement MCC related H2CZong-Zhe Yang
These MCC H2C(s) require to wait for MCC C2H to determine if the execution is successful. Through rtw89_wait_for_cond(), we make them wait for either a completion with data from MCC C2H handlers, which calls rtw89_complete_cond(), or timeout. Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221129083130.45708-6-pkshih@realtek.com
2022-12-01wifi: rtw89: mac: process MCC related C2HZong-Zhe Yang
Process C2H(s) related to MCC (multi-channel concurrency). These handling, which either call rtw89_complete_cond() or show message in debug mode, can be considered atomic/lock-free. So, they should be safe to be processed directly after C2H pre-check in previous patch. Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221129083130.45708-5-pkshih@realtek.com
2022-12-01wifi: rtw89: introduce helpers to wait/complete on conditionZong-Zhe Yang
MCC (multi-channel concurrency) related H2Cs (host to chip commands) require to wait for C2H (chip to host events) responses to judge the execution result and data. We introduce helpers to assist this process. Besides, we would like the helpers to be generic for use in driver even outside of MCC H2C/C2H, so we make a independent patch for them. In the following, I describe the things first. ``` (A) C2H is generated by FW, and then transferred upto driver. Hence, driver cannot get it immediately without a bit waitting/blocking. For this, we choose to use wait_for_completion_*() instead of busy polling. (B) From the driver management perspective, a scenario, e.g. MCC, may have mulitple kind of H2C functions requiring this process to wait for corresponding C2Hs. But, the driver management flow uses mutex to protect each behavior. So, one scenario triggers one H2C function at one time. To avoid rampant instances of struct completion for each H2C function, we choose to use one struct completion with one condition flag for one scenario. (C) C2Hs, which H2Cs will be waitting for, cannot be ordered with driver management flow, i.e. cannot enqueue work to the same ordered workqueue and cannot lock by the same mutex, to prevent H2C side from getting no C2H responses. So, those C2Hs are parsed in interrupt context directly as done in previous commit. (D) Following (C), the above underline H2Cs and C2Hs will be handled in different contexts without sync. So, we use atomic_cmpxchg() to compare and change the condition in atomic. ``` So, we introduce struct rtw89_wait_info which combines struct completion and atomic_t. Then, the below are the descriptions for helper functions. * rtw89_wait_for_cond() to wait for a completion based on a condition. * rtw89_complete_cond() to complete a given condition and carry data. Each rtw89_wait_info instance independently determines the meaning of its waitting conditions. But, RTW89_WAIT_COND_IDLE (UINT_MAX) is reserved. Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221129083130.45708-4-pkshih@realtek.com
2022-12-01wifi: rtw89: check if atomic before queuing c2hZong-Zhe Yang
Before queuing C2H work, we check atomicity of the C2H's handler first now. If atomic or lock-free, handle it directly; otherwise, handle it with mutex in work as previous. This prepares for MAC MCC C2Hs which require to be processed directly. And, their handlers will be functions which can be considered atomic. Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221129083130.45708-3-pkshih@realtek.com
2022-12-01wifi: rtw89: rfk: rename rtw89_mcc_info to rtw89_rfk_mcc_infoZong-Zhe Yang
The `rtw89_mcc_info mcc` is only for RFK MCC stuffs instead of common MCC management info. Replace it with `rtw89_rfk_mcc_info rfk_mcc` to avoid confusion and reserve `struct rtw89_mcc_info mcc` for MCC management code. (No logic changes.) Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221129083130.45708-2-pkshih@realtek.com
2022-12-01wifi: rtw88: 8821c: enable BT device recovery mechanismPing-Ke Shih
8821ce is a combo card, and BT is a USB device that could get card lost during stress test, and need WiFi firmware to detect and recover it, so driver sends a H2C to enable this mechanism. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221128075653.5221-1-pkshih@realtek.com
2022-12-01wifi: rtw89: 8852b: turn off PoP function in monitor modePing-Ke Shih
PoP stands for Packet on Packet that can improve performance in noisy environment, but it could get RX stuck suddenly. In normal mode, firmware can help to resolve the stuck, but firmware doesn't work in monitor mode. Therefore, turn off PoP to avoid RX stuck. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221125072416.94752-4-pkshih@realtek.com
2022-12-01wifi: rtw89: add HE radiotap for monitor modePing-Ke Shih
With basic HE radiotap, we can check data rate in sniffer data. To store the radiotap data, we reserve headroom of aligned 64 bytes, and then update HE radiotap in monitor mode, so it doesn't affect performance in normal mode. Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221125072416.94752-3-pkshih@realtek.com
2022-12-01wifi: rtw89: enable mac80211 virtual monitor interfaceZong-Zhe Yang
For running with mac80211 channel context ops and using only as monitor, we need to enable WANT_MONITOR_VIF to let mac80211 process virtual monitor interface. Then, we are able to set channel on the monitor from user space. Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221125072416.94752-2-pkshih@realtek.com
2022-12-01wifi: brcmfmac: Check the count value of channel spec to prevent ↵Minsuk Kang
out-of-bounds reads This patch fixes slab-out-of-bounds reads in brcmfmac that occur in brcmf_construct_chaninfo() and brcmf_enable_bw40_2g() when the count value of channel specifications provided by the device is greater than the length of 'list->element[]', decided by the size of the 'list' allocated with kzalloc(). The patch adds checks that make the functions free the buffer and return -EINVAL if that is the case. Note that the negative return is handled by the caller, brcmf_setup_wiphybands() or brcmf_cfg80211_attach(). Found by a modified version of syzkaller. Crash Report from brcmf_construct_chaninfo(): ================================================================== BUG: KASAN: slab-out-of-bounds in brcmf_setup_wiphybands+0x1238/0x1430 Read of size 4 at addr ffff888115f24600 by task kworker/0:2/1896 CPU: 0 PID: 1896 Comm: kworker/0:2 Tainted: G W O 5.14.0+ #132 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Workqueue: usb_hub_wq hub_event Call Trace: dump_stack_lvl+0x57/0x7d print_address_description.constprop.0.cold+0x93/0x334 kasan_report.cold+0x83/0xdf brcmf_setup_wiphybands+0x1238/0x1430 brcmf_cfg80211_attach+0x2118/0x3fd0 brcmf_attach+0x389/0xd40 brcmf_usb_probe+0x12de/0x1690 usb_probe_interface+0x25f/0x710 really_probe+0x1be/0xa90 __driver_probe_device+0x2ab/0x460 driver_probe_device+0x49/0x120 __device_attach_driver+0x18a/0x250 bus_for_each_drv+0x123/0x1a0 __device_attach+0x207/0x330 bus_probe_device+0x1a2/0x260 device_add+0xa61/0x1ce0 usb_set_configuration+0x984/0x1770 usb_generic_driver_probe+0x69/0x90 usb_probe_device+0x9c/0x220 really_probe+0x1be/0xa90 __driver_probe_device+0x2ab/0x460 driver_probe_device+0x49/0x120 __device_attach_driver+0x18a/0x250 bus_for_each_drv+0x123/0x1a0 __device_attach+0x207/0x330 bus_probe_device+0x1a2/0x260 device_add+0xa61/0x1ce0 usb_new_device.cold+0x463/0xf66 hub_event+0x10d5/0x3330 process_one_work+0x873/0x13e0 worker_thread+0x8b/0xd10 kthread+0x379/0x450 ret_from_fork+0x1f/0x30 Allocated by task 1896: kasan_save_stack+0x1b/0x40 __kasan_kmalloc+0x7c/0x90 kmem_cache_alloc_trace+0x19e/0x330 brcmf_setup_wiphybands+0x290/0x1430 brcmf_cfg80211_attach+0x2118/0x3fd0 brcmf_attach+0x389/0xd40 brcmf_usb_probe+0x12de/0x1690 usb_probe_interface+0x25f/0x710 really_probe+0x1be/0xa90 __driver_probe_device+0x2ab/0x460 driver_probe_device+0x49/0x120 __device_attach_driver+0x18a/0x250 bus_for_each_drv+0x123/0x1a0 __device_attach+0x207/0x330 bus_probe_device+0x1a2/0x260 device_add+0xa61/0x1ce0 usb_set_configuration+0x984/0x1770 usb_generic_driver_probe+0x69/0x90 usb_probe_device+0x9c/0x220 really_probe+0x1be/0xa90 __driver_probe_device+0x2ab/0x460 driver_probe_device+0x49/0x120 __device_attach_driver+0x18a/0x250 bus_for_each_drv+0x123/0x1a0 __device_attach+0x207/0x330 bus_probe_device+0x1a2/0x260 device_add+0xa61/0x1ce0 usb_new_device.cold+0x463/0xf66 hub_event+0x10d5/0x3330 process_one_work+0x873/0x13e0 worker_thread+0x8b/0xd10 kthread+0x379/0x450 ret_from_fork+0x1f/0x30 The buggy address belongs to the object at ffff888115f24000 which belongs to the cache kmalloc-2k of size 2048 The buggy address is located 1536 bytes inside of 2048-byte region [ffff888115f24000, ffff888115f24800) Memory state around the buggy address: ffff888115f24500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888115f24580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff888115f24600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff888115f24680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888115f24700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ================================================================== Crash Report from brcmf_enable_bw40_2g(): ================================================================== BUG: KASAN: slab-out-of-bounds in brcmf_cfg80211_attach+0x3d11/0x3fd0 Read of size 4 at addr ffff888103787600 by task kworker/0:2/1896 CPU: 0 PID: 1896 Comm: kworker/0:2 Tainted: G W O 5.14.0+ #132 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Workqueue: usb_hub_wq hub_event Call Trace: dump_stack_lvl+0x57/0x7d print_address_description.constprop.0.cold+0x93/0x334 kasan_report.cold+0x83/0xdf brcmf_cfg80211_attach+0x3d11/0x3fd0 brcmf_attach+0x389/0xd40 brcmf_usb_probe+0x12de/0x1690 usb_probe_interface+0x25f/0x710 really_probe+0x1be/0xa90 __driver_probe_device+0x2ab/0x460 driver_probe_device+0x49/0x120 __device_attach_driver+0x18a/0x250 bus_for_each_drv+0x123/0x1a0 __device_attach+0x207/0x330 bus_probe_device+0x1a2/0x260 device_add+0xa61/0x1ce0 usb_set_configuration+0x984/0x1770 usb_generic_driver_probe+0x69/0x90 usb_probe_device+0x9c/0x220 really_probe+0x1be/0xa90 __driver_probe_device+0x2ab/0x460 driver_probe_device+0x49/0x120 __device_attach_driver+0x18a/0x250 bus_for_each_drv+0x123/0x1a0 __device_attach+0x207/0x330 bus_probe_device+0x1a2/0x260 device_add+0xa61/0x1ce0 usb_new_device.cold+0x463/0xf66 hub_event+0x10d5/0x3330 process_one_work+0x873/0x13e0 worker_thread+0x8b/0xd10 kthread+0x379/0x450 ret_from_fork+0x1f/0x30 Allocated by task 1896: kasan_save_stack+0x1b/0x40 __kasan_kmalloc+0x7c/0x90 kmem_cache_alloc_trace+0x19e/0x330 brcmf_cfg80211_attach+0x3302/0x3fd0 brcmf_attach+0x389/0xd40 brcmf_usb_probe+0x12de/0x1690 usb_probe_interface+0x25f/0x710 really_probe+0x1be/0xa90 __driver_probe_device+0x2ab/0x460 driver_probe_device+0x49/0x120 __device_attach_driver+0x18a/0x250 bus_for_each_drv+0x123/0x1a0 __device_attach+0x207/0x330 bus_probe_device+0x1a2/0x260 device_add+0xa61/0x1ce0 usb_set_configuration+0x984/0x1770 usb_generic_driver_probe+0x69/0x90 usb_probe_device+0x9c/0x220 really_probe+0x1be/0xa90 __driver_probe_device+0x2ab/0x460 driver_probe_device+0x49/0x120 __device_attach_driver+0x18a/0x250 bus_for_each_drv+0x123/0x1a0 __device_attach+0x207/0x330 bus_probe_device+0x1a2/0x260 device_add+0xa61/0x1ce0 usb_new_device.cold+0x463/0xf66 hub_event+0x10d5/0x3330 process_one_work+0x873/0x13e0 worker_thread+0x8b/0xd10 kthread+0x379/0x450 ret_from_fork+0x1f/0x30 The buggy address belongs to the object at ffff888103787000 which belongs to the cache kmalloc-2k of size 2048 The buggy address is located 1536 bytes inside of 2048-byte region [ffff888103787000, ffff888103787800) Memory state around the buggy address: ffff888103787500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888103787580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff888103787600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff888103787680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff888103787700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ================================================================== Reported-by: Dokyung Song <dokyungs@yonsei.ac.kr> Reported-by: Jisoo Jang <jisoo.jang@yonsei.ac.kr> Reported-by: Minsuk Kang <linuxlovemin@yonsei.ac.kr> Reviewed-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Minsuk Kang <linuxlovemin@yonsei.ac.kr> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221116142952.518241-1-linuxlovemin@yonsei.ac.kr
2022-12-01mmc: sdhci-sprd: Fix no reset data and command after voltage switchWenchao Chen
After switching the voltage, no reset data and command will cause CMD2 timeout. Fixes: 29ca763fc26f ("mmc: sdhci-sprd: Add pin control support for voltage switch") Signed-off-by: Wenchao Chen <wenchao.chen@unisoc.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221130121328.25553-1-wenchao.chen@unisoc.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-12-01bonding: uninitialized variable in bond_miimon_inspect()Dan Carpenter
The "ignore_updelay" variable needs to be initialized to false. Fixes: f8a65ab2f3ff ("bonding: fix link recovery in mode 2 when updelay is nonzero") Signed-off-by: Dan Carpenter <error27@gmail.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Link: https://lore.kernel.org/r/Y4SWJlh3ohJ6EPTL@kili Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01Merge branch 'af_unix-fix-a-null-deref-in-sk_diag_dump_uid'Paolo Abeni
Kuniyuki Iwashima says: ==================== af_unix: Fix a NULL deref in sk_diag_dump_uid(). The first patch fixes a NULL deref when we dump a AF_UNIX socket's UID, and the second patch adds a repro/test for such a case. ==================== Link: https://lore.kernel.org/r/20221127012412.37969-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01af_unix: Add test for sock_diag and UDIAG_SHOW_UID.Kuniyuki Iwashima
The test prog dumps a single AF_UNIX socket's UID with and without unshare(CLONE_NEWUSER) and checks if it matches the result of getuid(). Without the preceding patch, the test prog is killed by a NULL deref in sk_diag_dump_uid(). # ./diag_uid TAP version 13 1..2 # Starting 2 tests from 3 test cases. # RUN diag_uid.uid.1 ... BUG: kernel NULL pointer dereference, address: 0000000000000270 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 105212067 P4D 105212067 PUD 1051fe067 PMD 0 Oops: 0000 [#1] PREEMPT SMP NOPTI Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.amzn2022.0.1 04/01/2014 RIP: 0010:sk_diag_fill (./include/net/sock.h:920 net/unix/diag.c:119 net/unix/diag.c:170) ... # 1: Test terminated unexpectedly by signal 9 # FAIL diag_uid.uid.1 not ok 1 diag_uid.uid.1 # RUN diag_uid.uid_unshare.1 ... # 1: Test terminated by timeout # FAIL diag_uid.uid_unshare.1 not ok 2 diag_uid.uid_unshare.1 # FAILED: 0 / 2 tests passed. # Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0 With the patch, the test succeeds. # ./diag_uid TAP version 13 1..2 # Starting 2 tests from 3 test cases. # RUN diag_uid.uid.1 ... # OK diag_uid.uid.1 ok 1 diag_uid.uid.1 # RUN diag_uid.uid_unshare.1 ... # OK diag_uid.uid_unshare.1 ok 2 diag_uid.uid_unshare.1 # PASSED: 2 / 2 tests passed. # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:0 error:0 Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01af_unix: Get user_ns from in_skb in unix_diag_get_exact().Kuniyuki Iwashima
Wei Chen reported a NULL deref in sk_user_ns() [0][1], and Paolo diagnosed the root cause: in unix_diag_get_exact(), the newly allocated skb does not have sk. [2] We must get the user_ns from the NETLINK_CB(in_skb).sk and pass it to sk_diag_fill(). [0]: BUG: kernel NULL pointer dereference, address: 0000000000000270 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 12bbce067 P4D 12bbce067 PUD 12bc40067 PMD 0 Oops: 0000 [#1] PREEMPT SMP CPU: 0 PID: 27942 Comm: syz-executor.0 Not tainted 6.1.0-rc5-next-20221118 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014 RIP: 0010:sk_user_ns include/net/sock.h:920 [inline] RIP: 0010:sk_diag_dump_uid net/unix/diag.c:119 [inline] RIP: 0010:sk_diag_fill+0x77d/0x890 net/unix/diag.c:170 Code: 89 ef e8 66 d4 2d fd c7 44 24 40 00 00 00 00 49 8d 7c 24 18 e8 54 d7 2d fd 49 8b 5c 24 18 48 8d bb 70 02 00 00 e8 43 d7 2d fd <48> 8b 9b 70 02 00 00 48 8d 7b 10 e8 33 d7 2d fd 48 8b 5b 10 48 8d RSP: 0018:ffffc90000d67968 EFLAGS: 00010246 RAX: ffff88812badaa48 RBX: 0000000000000000 RCX: ffffffff840d481d RDX: 0000000000000465 RSI: 0000000000000000 RDI: 0000000000000270 RBP: ffffc90000d679a8 R08: 0000000000000277 R09: 0000000000000000 R10: 0001ffffffffffff R11: 0001c90000d679a8 R12: ffff88812ac03800 R13: ffff88812c87c400 R14: ffff88812ae42210 R15: ffff888103026940 FS: 00007f08b4e6f700(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000270 CR3: 000000012c58b000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> unix_diag_get_exact net/unix/diag.c:285 [inline] unix_diag_handler_dump+0x3f9/0x500 net/unix/diag.c:317 __sock_diag_cmd net/core/sock_diag.c:235 [inline] sock_diag_rcv_msg+0x237/0x250 net/core/sock_diag.c:266 netlink_rcv_skb+0x13e/0x250 net/netlink/af_netlink.c:2564 sock_diag_rcv+0x24/0x40 net/core/sock_diag.c:277 netlink_unicast_kernel net/netlink/af_netlink.c:1330 [inline] netlink_unicast+0x5e9/0x6b0 net/netlink/af_netlink.c:1356 netlink_sendmsg+0x739/0x860 net/netlink/af_netlink.c:1932 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] ____sys_sendmsg+0x38f/0x500 net/socket.c:2476 ___sys_sendmsg net/socket.c:2530 [inline] __sys_sendmsg+0x197/0x230 net/socket.c:2559 __do_sys_sendmsg net/socket.c:2568 [inline] __se_sys_sendmsg net/socket.c:2566 [inline] __x64_sys_sendmsg+0x42/0x50 net/socket.c:2566 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x4697f9 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f08b4e6ec48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 000000000077bf80 RCX: 00000000004697f9 RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 0000000000000003 RBP: 00000000004d29e9 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000077bf80 R13: 0000000000000000 R14: 000000000077bf80 R15: 00007ffdb36bc6c0 </TASK> Modules linked in: CR2: 0000000000000270 [1]: https://lore.kernel.org/netdev/CAO4mrfdvyjFpokhNsiwZiP-wpdSD0AStcJwfKcKQdAALQ9_2Qw@mail.gmail.com/ [2]: https://lore.kernel.org/netdev/e04315e7c90d9a75613f3993c2baf2d344eef7eb.camel@redhat.com/ Fixes: cae9910e7344 ("net: Add UNIX_DIAG_UID to Netlink UNIX socket diagnostics.") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-by: Wei Chen <harperchen1110@gmail.com> Diagnosed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-30Merge tag 'mlx5-updates-2022-11-29' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2022-11-29 Misc update for mlx5 driver 1) Various trivial cleanups 2) Maor Dickman, Adds support for trap offload with additional actions 3) From Tariq, UMR (device memory registrations) cleanups, UMR WQE must be aligned to 64B per device spec, (not a bug fix). * tag 'mlx5-updates-2022-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Support devlink reload of IPsec core net/mlx5e: TC, Add offload support for trap with additional actions net/mlx5e: Do early return when setup vports dests for slow path flow net/mlx5: Remove redundant check net/mlx5e: Delete always true DMA check net/mlx5e: Don't access directly DMA device pointer net/mlx5e: Don't use termination table when redundant net/mlx5: Fix orthography errors in documentation net/mlx5: Use generic definition for UMR KLM alignment net/mlx5: Generalize name of UMR alignment definition net/mlx5: Remove unused UMR MTT definitions net/mlx5e: Add padding when needed in UMR WQEs net/mlx5: Remove unused ctx variables net/mlx5e: Replace zero-length arrays with DECLARE_FLEX_ARRAY() helper net/mlx5e: Remove unneeded io-mapping.h #include ==================== Link: https://lore.kernel.org/r/20221130051152.479480-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30net: phy: Add link between phy dev and mac devXiaolei Wang
If the external phy used by current mac interface is managed by another mac interface, it means that this network port cannot work independently, especially when the system suspends and resumes, the following trace may appear, so we should create a device link between phy dev and mac dev. WARNING: CPU: 0 PID: 24 at drivers/net/phy/phy.c:983 phy_error+0x20/0x68 Modules linked in: CPU: 0 PID: 24 Comm: kworker/0:2 Not tainted 6.1.0-rc3-00011-g5aaef24b5c6d-dirty #34 Hardware name: Freescale i.MX6 SoloX (Device Tree) Workqueue: events_power_efficient phy_state_machine unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x68/0x90 dump_stack_lvl from __warn+0xb4/0x24c __warn from warn_slowpath_fmt+0x5c/0xd8 warn_slowpath_fmt from phy_error+0x20/0x68 phy_error from phy_state_machine+0x22c/0x23c phy_state_machine from process_one_work+0x288/0x744 process_one_work from worker_thread+0x3c/0x500 worker_thread from kthread+0xf0/0x114 kthread from ret_from_fork+0x14/0x28 Exception stack(0xf0951fb0 to 0xf0951ff8) Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20221130021216.1052230-1-xiaolei.wang@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfJakub Kicinski
Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Check for interval validity in all concatenation fields in nft_set_pipapo, from Stefano Brivio. 2) Missing preemption disabled in conntrack and flowtable stat updates, from Xin Long. 3) Fix compilation warning when CONFIG_NF_CONNTRACK_MARK=n. Except for 3) which was a bug introduced in a recent fix in 6.1-rc - anything else, broken for several releases. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: ctnetlink: fix compilation warning after data race fixes in ct mark netfilter: conntrack: fix using __this_cpu_add in preemptible netfilter: flowtable_offload: fix using __this_cpu_add in preemptible netfilter: nft_set_pipapo: Actually validate intervals in fields after the first one ==================== Link: https://lore.kernel.org/r/20221130121934.1125-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30Merge branch 'net-devlink-return-the-driver-name-in-devlink_nl_info_fill'Jakub Kicinski
Vincent Mailhol says: ==================== net: devlink: return the driver name in devlink_nl_info_fill The driver name is available in device_driver::name. Right now, drivers still have to report this piece of information themselves in their devlink_ops::info_get callback function. The goal of this series is to have the devlink core to report this information instead of the drivers. The first patch fulfills the actual goal of this series: modify devlink core to report the driver name and clean-up all drivers. Both have to be done in an atomic change to avoid attribute duplication. This same patch also removes the devlink_info_driver_name_put() function to prevent future drivers from reporting the driver name themselves. The second patch allows the core to call devlink_nl_info_fill() even if the devlink_ops::info_get() callback is NULL. This leads to the third and final patch which cleans up the drivers which have an empty info_get(). ==================== Link: https://lore.kernel.org/r/20221129095140.3913303-1-mailhol.vincent@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30net: devlink: clean-up empty devlink_ops::info_get()Vincent Mailhol
devlink_ops::info_get() is now optional and devlink will continue to report information even if that callback gets removed. Remove all the empty devlink_ops::info_get() callbacks from the drivers. Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30net: devlink: make the devlink_ops::info_get() callback optionalVincent Mailhol
Some drivers only reported the driver name in their devlink_ops::info_get() callback. Now that the core provides this information, the callback became empty. For such drivers, just removing the callback would prevent the core from executing devlink_nl_info_fill() meaning that "devlink dev info" would not return anything. Make the callback function optional by executing devlink_nl_info_fill() even if devlink_ops::info_get() is NULL. N.B.: the drivers with devlink support which previously did not implement devlink_ops::info_get() will now also be able to report the driver name. Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30net: devlink: let the core report the driver name instead of the driversVincent Mailhol
The driver name is available in device_driver::name. Right now, drivers still have to report this piece of information themselves in their devlink_ops::info_get callback function. In order to factorize code, make devlink_nl_info_fill() add the driver name attribute. Now that the core sets the driver name attribute, drivers are not supposed to call devlink_info_driver_name_put() anymore. Remove devlink_info_driver_name_put() and clean-up all the drivers using this function in their callback. Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Tested-by: Ido Schimmel <idosch@nvidia.com> # mlxsw Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30net: ethernet: ti: am65-cpsw: Fix RGMII configuration at SPEED_10Siddharth Vadapalli
The am65-cpsw driver supports configuring all RGMII variants at interface speed of 10 Mbps. However, in the process of shifting to the PHYLINK framework, the support for all variants of RGMII except the PHY_INTERFACE_MODE_RGMII variant was accidentally removed. Fix this by using phy_interface_mode_is_rgmii() to check for all variants of RGMII mode. Fixes: e8609e69470f ("net: ethernet: ti: am65-cpsw: Convert to PHYLINK") Reported-by: Schuyler Patton <spatton@ti.com> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Link: https://lore.kernel.org/r/20221129050639.111142-1-s-vadapalli@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30dt-bindings: nfc: nxp,nci: Document NQ310 compatibleLuca Weiss
The NQ310 is another NFC chip from NXP, document the compatible in the bindings. Signed-off-by: Luca Weiss <luca@z3ntu.xyz> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20221128173744.833018-1-luca@z3ntu.xyz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30Merge branch 'support-direct-read-from-region'Jakub Kicinski
Jacob Keller says: ==================== support direct read from region A long time ago when initially implementing devlink regions in ice I proposed the ability to allow reading from a region without taking a snapshot [1]. I eventually dropped this work from the original series due to size. Then I eventually lost track of submitting this follow up. This can be useful when interacting with some region that has some definitive "contents" from which snapshots are made. For example the ice driver has regions representing the contents of the device flash. If userspace wants to read the contents today, it must first take a snapshot and then read from that snapshot. This makes sense if you want to read a large portion of data or you want to be sure reads are consistently from the same recording of the flash. However if user space only wants to read a small chunk, it must first generate a snapshot of the entire contents, perform a read from the snapshot, and then delete the snapshot after reading. For such a use case, a direct read from the region makes more sense. This can be achieved by allowing the devlink region read command to work without a snapshot. Instead the portion to be read can be forwarded directly to the driver via a new .read callback. This avoids the need to read the entire region contents into memory first and avoids the software overhead of creating a snapshot and then deleting it. This series implements such behavior and hooks up the ice NVM and shadow RAM regions to allow it. [1] https://lore.kernel.org/netdev/20200130225913.1671982-1-jacob.e.keller@intel.com/ ==================== Link: https://lore.kernel.org/r/20221128203647.1198669-1-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30ice: implement direct read for NVM and Shadow RAM regionsJacob Keller
Implement the .read handler for the NVM and Shadow RAM regions. This enables user space to read a small chunk of the flash without needing the overhead of creating a full snapshot. Update the documentation for ice to detail which regions have direct read support. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30ice: document 'shadow-ram' devlink regionJacob Keller
78ad87da9978 ("ice: devlink: add shadow-ram region to snapshot Shadow RAM") added support for the 'shadow-ram' devlink region, but did not document it in the ice devlink documentation. Fix this. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30ice: use same function to snapshot both NVM and Shadow RAMJacob Keller
The ice driver supports a region for both the flat NVM contents as well as the Shadow RAM which is a layer built on top of the flash during device initialization. These regions use an almost identical read function, except that the NVM needs to set the direct flag when reading, while Shadow RAM needs to read without the direct flag set. They each call ice_read_flat_nvm with the only difference being whether to set the direct flash flag. The NVM region read function also was fixed to read the NVM in blocks to avoid a situation where the firmware reclaims the lock due to taking too long. Note that the region snapshot function takes the ops pointer so the function can easily determine which region to read. Make use of this and re-use the NVM snapshot function for both the NVM and Shadow RAM regions. This makes Shadow RAM benefit from the same block approach as the NVM region. It also reduces code in the ice driver. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30devlink: support directly reading from region memoryJacob Keller
To read from a region, user space must currently request a new snapshot of the region and then read from that snapshot. This can sometimes be overkill if user space only reads a tiny portion. They first create the snapshot, then request a read, then destroy the snapshot. For regions which have a single underlying "contents", it makes sense to allow supporting direct reading of the region data. Extend the DEVLINK_CMD_REGION_READ to allow direct reading from a region if requested via the new DEVLINK_ATTR_REGION_DIRECT. If this attribute is set, then perform a direct read instead of using a snapshot. Direct read is mutually exclusive with DEVLINK_ATTR_REGION_SNAPSHOT_ID, and care is taken to ensure that we reject commands which provide incorrect attributes. Regions must enable support for direct read by implementing the .read() callback function. If a region does not support such direct reads, a suitable extended error message is reported. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30devlink: refactor region_read_snapshot_fill to use a callback functionJacob Keller
The devlink_nl_region_read_snapshot_fill is used to copy the contents of a snapshot into a message for reporting to userspace via the DEVLINK_CMG_REGION_READ netlink message. A future change is going to add support for directly reading from a region. Almost all of the logic for this new capability is identical. To help reduce code duplication and make this logic more generic, refactor the function to take a cb and cb_priv pointer for doing the actual copy. Add a devlink_region_snapshot_fill implementation that will simply copy the relevant chunk of the region. This does require allocating some storage for the chunk as opposed to simply passing the correct address forward to the devlink_nl_cmg_region_read_chunk_fill function. A future change to implement support for directly reading from a region without a snapshot will provide a separate implementation that calls the newly added devlink region operation. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30devlink: remove unnecessary parameter from chunk_fill functionJacob Keller
The devlink parameter of the devlink_nl_cmd_region_read_chunk_fill function is not used. Remove it, to simplify the function signature. Once removed, it is also obvious that the devlink parameter is not necessary for the devlink_nl_region_read_snapshot_fill either. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30devlink: find snapshot in devlink_nl_cmd_region_read_dumpitJacob Keller
The snapshot pointer is obtained inside of the function devlink_nl_region_read_snapshot_fill. Simplify this function by locating the snapshot upfront in devlink_nl_cmd_region_read_dumpit instead. This aligns with how other netlink attributes are handled, and allows us to exit slightly earlier if an invalid snapshot ID is provided. It also allows us to pass the snapshot pointer directly to the devlink_nl_region_read_snapshot_fill, and remove the now unused attrs parameter. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30devlink: report extended error message in region_read_dumpit()Jacob Keller
Report extended error details in the devlink_nl_cmd_region_read_dumpit() function, by using the extack structure from the netlink_callback. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30devlink: use min_t to calculate data_sizeJacob Keller
The calculation for the data_size in the devlink_nl_read_snapshot_fill function uses an if statement that is better expressed using the min_t macro. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30net: microchip: vcap: Change how the rule id is generatedHoratiu Vultur
Currently whenever a new rule id is generated, it picks up the next number bigger than previous id. So it would always be 1, 2, 3, etc. When the rule with id 1 will be deleted and a new rule will be added, it will have the id 4 and not id 1. In theory this can be a problem if at some point a rule will be added and removed ~0 times. Then no more rules can be added because there are no more ids. Change this such that when a new rule is added, search for an empty rule id starting with value of 1 as value 0 is reserved. Fixes: c9da1ac1c212 ("net: microchip: sparx5: Adding initial tc flower support for VCAP API") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Link: https://lore.kernel.org/r/20221128142959.8325-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30octeontx2-af: Simplify a size computation in rvu_npc_exact_init()Christophe JAILLET
We know that table_size = table->mem_table.depth * table->mem_table.ways, so use it instead, it is less verbose. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/5230dabe27f48948a9fd0f50a62e2437b65e6a6e.1669378798.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30octeontx2-af: Fix the size of memory allocated for the 'id_bmap' bitmapChristophe JAILLET
This allocation is really spurious. The size of the bitmap is 'tot_ids' and it is used as such in the driver. So we could expect something like: table->id_bmap = devm_kcalloc(rvu->dev, BITS_TO_LONGS(table->tot_ids), sizeof(long), GFP_KERNEL); However, when the bitmap is allocated, we allocate: BITS_TO_LONGS(table->tot_ids) * table->tot_ids ~= table->tot_ids / 32 * table->tot_ids ~= table->tot_ids^2 / 32 It is proportional to the square of 'table->tot_ids' which seems to potentially be big. Allocate the expected amount of memory, and switch to the bitmap API to have it more straightforward. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/ce2710771939065d68f95d86a27cf7cea7966365.1669378798.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-30octeontx2-af: Use the bitmap API to allocate bitmapsChristophe JAILLET
Use devm_bitmap_zalloc() instead of hand-writing it. This also makes the comment "Allocate bitmap for 32 entry mcam" more explicit because now 32 is really used in the allocation function, instead of an obscure 'sizeof(long)'. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/24177a9ee7043259448b735263d9cfd6a70e89a4.1669378798.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>