git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-11-17	ath11k: add trace log support	Venkateswara Naralasetty
	This change is to add trace log support for, * WMI events * WMI commands * ath11k_dbg messages * ath11k_dbg_dump messages * ath11k_log_info messages * ath11k_log_warn messages * ath11k_log_err messages Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.5.0.1-00652-QCAHKSWPL_SILICONZ-1 Signed-off-by: Venkateswara Naralasetty <quic_vnaralas@quicinc.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1636439755-30419-1-git-send-email-quic_vnaralas@quicinc.com
2021-11-17	ath11k: Add missing qmi_txn_cancel()	Anilkumar Kolli
	Currently many functions do not follow this guidance when qmi_send_request() fails, therefore add missing qmi_txn_cancel() in the qmi_send_request() error path. Also remove initialization on 'struct qmi_txn' since qmi_tx_init() performs all necessary initialization. Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.4.0.1-01838-QCAHKSWPL_SILICONZ-1 Signed-off-by: Anilkumar Kolli <akolli@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1635857558-21733-1-git-send-email-akolli@codeaurora.org
2021-11-17	ath11k: use cache line aligned buffers for dbring	Rameshkumar Sundaram
	The DMA buffers of dbring which is used for spectral/cfr starts at certain offset from original kmalloc() returned buffer. This is not cache line aligned. And also driver tries to access the data that is immediately before this offset address (i.e. buff->paddr) after doing dma map. This will cause cache line sharing issues and data corruption, if CPU happen to write back cache after HW has dma'ed the data. Fix this by mapping a cache line aligned buffer to dma. Tested on: IPQ8074 hw2.0 AHB WLAN.HK.2.5.0.1-01100-QCAHKSWPL_SILICONZ-1 Signed-off-by: Rameshkumar Sundaram <quic_ramess@quicinc.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1635831693-15962-1-git-send-email-quic_ramess@quicinc.com
2021-11-17	ath11k: Disabling credit flow for WMI path	P Praneesh
	Firmware credit flow control is enabled for WMI control services, which expects available tokens should be acquired before sending a command to the target. Also the token gets released when firmware receives the command. This credit-based flow limits driver to send WMI command only when the token available which is causing WMI commands to timeout and return -EAGAIN, whereas firmware has enough capability to process the WMI command. To fix this Tx starvation issue, introduce the ability to disable the credit flow for the WMI path. The driver sends WMI configuration for disabling credit flow to firmware by two ways. 1. By using a global flag (HTC_MSG_SETUP_COMPLETE_EX_ID msg type flags) 2. By using a local flag (ATH11K_HTC_CONN_FLAGS_DISABLE_CREDIT_FLOW_CTRL = 1 << 3) Ath11k uses both these configurations to disable credit flow for the WMI path completely. Also added a hw_param member for credit flow control by which we can enable or disable it based on per-target basis. Currently we are disabling credit flow for IPQ8074, IPQ6018, and QCN9074 as recommended by firmware. Tested-on: IPQ8074 hw2.0 AHB WLAN.HK.2.4.0.1-01492-QCAHKSWPL_SILICONZ-1 Tested-on: IPQ6018 hw1.0 AHB WLAN.HK.2.4.0.1-00330-QCAHKSWPL_SILICONZ-1 Co-developed-by: Pravas Kumar Panda <kumarpan@codeaurora.org> Signed-off-by: Pravas Kumar Panda <kumarpan@codeaurora.org> Signed-off-by: P Praneesh <quic_ppranees@quicinc.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1635156494-20059-1-git-send-email-quic_ppranees@quicinc.com
2021-11-17	ath11k: Fix ETSI regd with weather radar overlap	Sven Eckelmann
	Some ETSI countries have a small overlap in the wireless-regdb with an ETSI channel (5590-5650). A good example is Australia: country AU: DFS-ETSI (2400 - 2483.5 @ 40), (36) (5150 - 5250 @ 80), (23), NO-OUTDOOR, AUTO-BW (5250 - 5350 @ 80), (20), NO-OUTDOOR, AUTO-BW, DFS (5470 - 5600 @ 80), (27), DFS (5650 - 5730 @ 80), (27), DFS (5730 - 5850 @ 80), (36) (57000 - 66000 @ 2160), (43), NO-OUTDOOR If the firmware (or the BDF) is shipped with these rules then there is only a 10 MHz overlap with the weather radar: * below: 5470 - 5590 * weather radar: 5590 - 5600 * above: (none for the rule "5470 - 5600 @ 80") There are several wrong assumption in the ath11k code: * there is always a valid range below the weather radar (actually: there could be no range below the weather radar range OR range could be smaller than 20 MHz) * intersected range in the weather radar range is valid (actually: the range could be smaller than 20 MHz) * range above weather radar is either empty or valid (actually: the range could be smaller than 20 MHz) These wrong assumption will lead in this example to a rule (5590 - 5600 @ 20), (N/A, 27), (600000 ms), DFS, AUTO-BW which is invalid according to is_valid_reg_rule() because the freq_diff is only 10 MHz but the max_bandwidth is set to 20 MHz. Which results in a rejection like: WARNING: at backports-20210222_001-4.4.60-b157d2276/net/wireless/reg.c:3984 [...] Call trace: [<ffffffbffc3d2e50>] reg_get_max_bandwidth+0x300/0x3a8 [cfg80211] [<ffffffbffc3d3d0c>] regulatory_set_wiphy_regd_sync+0x3c/0x98 [cfg80211] [<ffffffbffc651598>] ath11k_regd_update+0x1a8/0x210 [ath11k] [<ffffffbffc652108>] ath11k_regd_update_work+0x18/0x20 [ath11k] [<ffffffc0000a93e0>] process_one_work+0x1f8/0x340 [<ffffffc0000a9784>] worker_thread+0x25c/0x448 [<ffffffc0000aedc8>] kthread+0xd0/0xd8 [<ffffffc000085550>] ret_from_fork+0x10/0x40 ath11k c000000.wifi: failed to perform regd update : -22 Invalid regulatory domain detected To avoid this, the algorithm has to be changed slightly. Instead of splitting a rule which overlaps with the weather radar range into 3 pieces and accepting the first two parts blindly, it must actually be checked for each piece whether it is a valid range. And only if it is valid, add it to the output array. When these checks are in place, the processed rules for AU would end up as country AU: DFS-ETSI (2400 - 2483 @ 40), (N/A, 36), (N/A) (5150 - 5250 @ 80), (6, 23), (N/A), NO-OUTDOOR, AUTO-BW (5250 - 5350 @ 80), (6, 20), (0 ms), NO-OUTDOOR, DFS, AUTO-BW (5470 - 5590 @ 80), (6, 27), (0 ms), DFS, AUTO-BW (5650 - 5730 @ 80), (6, 27), (0 ms), DFS, AUTO-BW (5730 - 5850 @ 80), (6, 36), (N/A), AUTO-BW and will be accepted by the wireless regulatory code. Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/20211112153116.1214421-1-sven@narfation.org
2021-11-16	net/mlx5: E-switch, Create QoS on demand	Dmytro Linkin
	Don't create eswitch QoS (root TSAR) on switch mode change. Create it on first child TSAR object creation - vport or rate group. Keep track root TSAR references and release root TSAR with last object deletion. No need to check for QoS is enabled when installing tc matchall filter. Remove related helper function due to no users of it. Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: E-switch, Enable vport QoS on demand	Dmytro Linkin
	Vports' QoS is not commonly used but consume SW/HW resources, which becomes an issue on BlueField SoC systems. Don't enable QoS on vports by default on eswitch mode change and enable when it's going to be used by one of the top level users: - configuring TC matchall filter with police action; - setting rate with legacy NDO API; - calling devlink ops->rate_leaf_*() callbacks. Disable vport QoS on vport cleanup. Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: E-switch, move offloads mode callbacks to offloads file	Parav Pandit
	eswitch.c is mainly for common code between legacy and offloads mode. MAC address get and set via devlink is applicable only in offloads mode. Hence, move it to eswitch_offloads.c file. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: E-switch, Reuse mlx5_eswitch_set_vport_mac	Parav Pandit
	mlx5_eswitch_set_vport_mac() routine already does necessary checks which are duplicated in implementation of mlx5_devlink_port_function_hw_addr_set(). Hence, reuse mlx5_eswitch_set_vport_mac() and cut down the code. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: E-switch, Remove vport enabled check	Parav Pandit
	An eswitch vport of the devlink port is always enabled before a devlink port is registered. And a eswitch vport is always disabled after a devlink port is unregistered. Hence avoid the vport enabled check in the devlink callback routine. Such check is only applicable in the legacy SR-IOV callbacks. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Sunil Sudhakar Rani <sunrani@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5e: Specify out ifindex when looking up decap route	Chris Mi
	There is a use case that the local and remote VTEPs are in the same host. Currently, the out ifindex is not specified when looking up the decap route for offloads. So in this case, a local route is returned and the route dev is lo. Actual tunnel interface can be created with a parameter "dev" [1], which specifies the physical device to use for tunnel endpoint communication. Pass this parameter to driver when looking up decap route for offloads. So that a unicast route will be returned. [1] ip link add name vxlan1 type vxlan id 100 dev enp4s0f0 remote 1.1.1.1 dstport 4789 Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5e: TC, Move comment about mod header flag to correct place	Roi Dayan
	Move the comment to the correct place where the driver actually removes the flag and not in the check that maybe pedit actions exists. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5e: TC, Move kfree() calls after destroying all resources	Roi Dayan
	When deleting fdb/nic flow rules first release all resources and then call the kfree() calls instead of sparse them around the function. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5e: TC, Destroy nic flow counter if exists	Roi Dayan
	Counter is only added if counter flag exists. So check the counter fag exists for deleting the counter. This is the same as in add/del fdb flow. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: TC, using swap() instead of tmp variable	Yihao Han
	swap() was used instead of the tmp variable to swap values Signed-off-by: Yihao Han <hanyihao@vivo.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: CT: Allow static allocation of mod headers	Paul Blakey
	As each CT rule uses at least 4 modify header actions, each rule causes at least 3 reallocations by the mod header actions api. Allow initial static allocation of the mod acts array, and use it for CT rules. If the static allocation is exceeded go back to dynamic allocation. Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com>
2021-11-16	net/mlx5e: Refactor mod header management API	Paul Blakey
	For all mod hdr related functions to reside in a single self contained component (mod_hdr.c), refactor alloc() and add get_id() so that user won't rely on internal implementation, and move both to mod_hdr component. Rename the prefix to mlx5e_mod_hdr_* as other mod hdr functions. Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: Avoid printing health buffer when firmware is unavailable	Aya Levin
	Use firmware version field as an indication to health buffer's sanity. When firmware version is 0xFFFFFFFF, deduce that firmware is unavailable and avoid printing the health buffer to dmesg as it doesn't provide debug info. Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: Fix format-security build warnings	Saeed Mahameed
	Treat the string as an argument to avoid this. drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c:482:5: error: format string is not a string literal (potentially insecure) name); ^~~~ drivers/net/ethernet/mellanox/mlx5/core/en_stats.c:2079:4: error: format string is not a string literal (potentially insecure) ptp_ch_stats_desc[i].format); ^~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
2021-11-16	net/mlx5e: Support ethtool cq mode	Saeed Mahameed
	Add support for ethtool coalesce cq mode set and get. Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
2021-11-16	net: stmmac: Fix signed/unsigned wreckage	Thomas Gleixner
	The recent addition of timestamp correction to compensate the CDC error introduced a subtle signed/unsigned bug in stmmac_get_tx_hwtstamp() while it managed for some obscure reason to avoid that in stmmac_get_rx_hwtstamp(). The issue is: s64 adjust = 0; u64 ns; adjust += -(2 * (NSEC_PER_SEC / priv->plat->clk_ptp_rate)); ns += adjust; works by chance on 64bit, but falls apart on 32bit because the compiler knows that adjust fits into 32bit and then treats the addition as a u64 + u32 resulting in an off by ~2 seconds failure. The RX variant uses an u64 for adjust and does the adjustment via ns -= adjust; because consistency is obviously overrated. Get rid of the pointless zero initialized adjust variable and do: ns -= (2 * NSEC_PER_SEC) / priv->plat->clk_ptp_rate; which is obviously correct and spares the adjust obfuscation. Aside of that it yields a more accurate result because the multiplication takes place before the integer divide truncation and not afterwards. Stick the calculation into an inline so it can't be accidentally disimproved. Return an u32 from that inline as the result is guaranteed to fit which lets the compiler optimize the substraction. Cc: stable@vger.kernel.org Fixes: 3600be5f58c1 ("net: stmmac: add timestamp correction to rid CDC sync error") Reported-by: Benedikt Spranger <b.spranger@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Benedikt Spranger <b.spranger@linutronix.de> Tested-by: Kurt Kanzenbach <kurt@linutronix.de> # Intel EHL Link: https://lore.kernel.org/r/87mtm578cs.ffs@tglx Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	net: document SMII and correct phylink's new validation mechanism	Russell King (Oracle)
	SMII has not been documented in the kernel, but information on this PHY interface mode has been recently found. Document it, and correct the recently introduced phylink handling for this interface mode. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/E1mmfVl-0075nP-14@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	Merge branch 'net-fix-the-mirred-packet-drop-due-to-the-incorrect-dst'	Jakub Kicinski
	Xin Long says: ==================== net: fix the mirred packet drop due to the incorrect dst This issue was found when using OVS HWOL on OVN-k8s. These packets dropped on rx path were seen with output dst, which should've been dropped from the skbs when redirecting them. The 1st patch is to the fix and the 2nd is a selftest to reproduce and verify it. ==================== Link: https://lore.kernel.org/r/cover.1636734751.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	selftests: add a test case for mirred egress to ingress	Davide Caratti
	add a selftest that verifies the correct behavior of TC act_mirred egress to ingress: in particular, it checks if the dst_entry is removed from skb before redirect egress -> ingress. The correct behavior is: an ICMP 'echo request' generated by ping will be received and generate a reply the same way as the one generated by mausezahn. Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Cong Wang <cong.wang@bytedance.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	net: sched: act_mirred: drop dst for the direction from egress to ingress	Xin Long
	Without dropping dst, the packets sent from local mirred/redirected to ingress will may still use the old dst. ip_rcv() will drop it as the old dst is for output and its .input is dst_discard. This patch is to fix by also dropping dst for those packets that are mirred or redirected from egress to ingress in act_mirred. Note that we don't drop it for the direction change from ingress to egress, as on which there might be a user case attaching a metadata dst by act_tunnel_key that would be used later. Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Cong Wang <cong.wang@bytedance.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	amt: cancel delayed_work synchronously in amt_fini()	Taehee Yoo
	When the amt module is being removed, it calls cancel_delayed_work() to cancel pending delayed_work. But this function doesn't wait for canceling delayed_work. So, workers can be still doing after module delete. In order to avoid this, cancel_delayed_work_sync() should be used instead. Suggested-by: Jakub Kicinski <kuba@kernel.org> Fixes: bc54e49c140b ("amt: add multicast(IGMP) report message handler") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20211116160923.25258-1-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	Merge branch ↵	Jakub Kicinski
	'r8169-disable-detection-of-further-chip-versions-that-didn-t-make-it-to-the-mass-market' Heiner Kallweit says: ==================== r8169: disable detection of further chip versions that didn't make it to the mass market There's no sign of life from further chip versions. Seems they didn't make it to the mass market. Let's disable detection and if nobody complains remove support a few kernel versions later. ==================== Link: https://lore.kernel.org/r/7708d13a-4a2b-090d-fadf-ecdd0fff5d2e@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	r8169: disable detection of chip version 41	Heiner Kallweit
	It seems this chip version never made it to the wild. Therefore disable detection and if nobody complains remove support completely later. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	r8169: disable detection of chip version 45	Heiner Kallweit
	It seems this chip version never made it to the wild. Therefore disable detection and if nobody complains remove support completely later. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	r8169: disable detection of chip versions 49 and 50	Heiner Kallweit
	It seems these chip versions never made it to the wild. Therefore disable detection and if nobody complains remove support completely later. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	r8169: enable ASPM L1/L1.1 from RTL8168h	Heiner Kallweit
	With newer chip versions ASPM-related issues seem to occur only if L1.2 is enabled. I have a test system with RTL8168h that gives a number of rx_missed errors when running iperf and L1.2 is enabled. With L1.2 disabled (and L1 + L1.1 active) everything is fine. See also [0]. Can't test this, but L1 + L1.1 being active should be sufficient to reach higher package power saving states. [0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1942830 Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/36feb8c4-a0b6-422a-899c-e61f2e869dfe@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	Merge branch 'net-better-packing-of-global-vars'	Jakub Kicinski
	Eric Dumazet says: ==================== net: better packing of global vars First two patches avoid holes in data section, and last patch makes sure some siphash keys are contained in a single cache line. ==================== Link: https://lore.kernel.org/r/20211115172303.3732746-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	net: align static siphash keys	Eric Dumazet
	siphash keys use 16 bytes. Define siphash_aligned_key_t macro so that we can make sure they are not crossing a cache line boundary. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	net: use .data.once section in netdev_level_once()	Eric Dumazet
	Same rationale than prior patch : using the dedicated section avoid holes and pack all these bool values. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	once: use __section(".data.once")	Eric Dumazet
	.data.once contains nicely packed bool variables. It is used already by DO_ONCE_LITE(). Using it also in DO_ONCE() removes holes in .data section. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	MAINTAINERS: remove GR-everest-linux-l2@marvell.com	Pavel Skripkin
	I've sent a patch to GR-everest-linux-l2@marvell.com few days ago and got a reply from postmaster@marvell.com: Delivery has failed to these recipients or groups: gr-everest-linux-l2@marvell.com<mailto:gr-everest-linux-l2@marvell.com> The email address you entered couldn't be found. Please check the recipient's email address and try to resend the message. If the problem continues, please contact your helpdesk. As requested by Alok Prasad, replacing GR-everest-linux-l2@marvell.com with Manish Chopra's email address. [0] Link: https://lore.kernel.org/all/20211116081601.11208-1-palok@marvell.com/ [0] Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Link: https://lore.kernel.org/r/20211116141303.32180-1-paskripkin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	bnxt_en: Fix compile error regression when CONFIG_BNXT_SRIOV is not set	Michael Chan
	bp->sriov_cfg is not defined when CONFIG_BNXT_SRIOV is not set. Fix it by adding a helper function bnxt_sriov_cfg() to handle the logic with or without the config option. Fixes: 46d08f55d24e ("bnxt_en: extend RTNL to VF check in devlink driver_reinit") Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/1637090770-22835-1-git-send-email-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	net: mvmdio: fix compilation warning	Marcin Wojtas
	The kernel test robot reported a following issue: >> drivers/net/ethernet/marvell/mvmdio.c:426:36: warning: unused variable 'orion_mdio_acpi_match' [-Wunused-const-variable] static const struct acpi_device_id orion_mdio_acpi_match[] = { ^ 1 warning generated. Fix that by surrounding the variable by appropriate ifdef. Fixes: c54da4c1acb1 ("net: mvmdio: add ACPI support") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Marcin Wojtas <mw@semihalf.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20211115153024.209083-1-mw@semihalf.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	Merge tag 'mac80211-for-net-2021-11-16' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Couple of fixes: * bad dont-reorder check * throughput LED trigger for various new(ish) paths * radiotap header generation * locking assertions in mac80211 with monitor mode * radio statistics * don't try to access IV when not present * call stop_ap for P2P_GO as well as we should * tag 'mac80211-for-net-2021-11-16' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211: mac80211: fix throughput LED trigger mac80211: fix monitor_sdata RCU/locking assertions mac80211: drop check for DONT_REORDER in __ieee80211_select_queue mac80211: fix radiotap header generation mac80211: do not access the IV when it was stripped nl80211: fix radio statistics in survey dump cfg80211: call cfg80211_stop_ap when switch from P2P_GO type ==================== Link: https://lore.kernel.org/r/20211116160845.157214-1-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf	Jakub Kicinski
	Daniel Borkmann says: ==================== pull-request: bpf 2021-11-16 We've added 12 non-merge commits during the last 5 day(s) which contain a total of 23 files changed, 573 insertions(+), 73 deletions(-). The main changes are: 1) Fix pruning regression where verifier went overly conservative rejecting previsouly accepted programs, from Alexei Starovoitov and Lorenz Bauer. 2) Fix verifier TOCTOU bug when using read-only map's values as constant scalars during verification, from Daniel Borkmann. 3) Fix a crash due to a double free in XSK's buffer pool, from Magnus Karlsson. 4) Fix libbpf regression when cross-building runqslower, from Jean-Philippe Brucker. 5) Forbid use of bpf_ktime_get_coarse_ns() and bpf_timer_() helpers in tracing programs due to deadlock possibilities, from Dmitrii Banshchikov. 6) Fix checksum validation in sockmap's udp_read_sock() callback, from Cong Wang. 7) Various BPF sample fixes such as XDP stats in xdp_sample_user, from Alexander Lobakin. 8) Fix libbpf gen_loader error handling wrt fd cleanup, from Kumar Kartikeya Dwivedi. https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: udp: Validate checksum in udp_read_sock() bpf: Fix toctou on read-only map's constant scalar tracking samples/bpf: Fix build error due to -isystem removal selftests/bpf: Add tests for restricted helpers bpf: Forbid bpf_ktime_get_coarse_ns and bpf_timer_* in tracing progs libbpf: Perform map fd cleanup for gen_loader in case of error samples/bpf: Fix incorrect use of strlen in xdp_redirect_cpu tools/runqslower: Fix cross-build samples/bpf: Fix summary per-sec stats in xdp_sample_user selftests/bpf: Check map in map pruning bpf: Fix inner map state pruning regression. xsk: Fix crash on double free in buffer pool ==================== Link: https://lore.kernel.org/r/20211116141134.6490-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-16	net/mlx5: E-Switch, return error if encap isn't supported	Raed Salem
	On regular ConnectX HCAs getting encap mode isn't supported when the E-Switch is in NONE mode. Current code would return no error code when trying to get encap mode in such case which is wrong. Fix by returning error value to indicate failure to caller in such case. Fixes: 8e0aa4bc959c ("net/mlx5: E-switch, Protect eswitch mode changes") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: Lag, update tracker when state change event received	Maher Sanalla
	Currently, In NETDEV_CHANGELOWERSTATE/NETDEV_CHANGEUPPERSTATE events handling, tracking is not fully completed if the LAG device is not ready at the time the events occur. But, we must keep track of the upper and lower states after receiving the events because RoCE needs this info in mlx5_lag_get_roce_netdev() - in order to return the corresponding port that its running on. Returning the wrong (not most recent) port will lead to gids table being incorrect. For example: If during the attachment of a slave to the bond, the other non-attached port performs pci_reload, then the LAG device is not ready, but that should not result in dismissing attached slave tracker update automatically (which is performed in mlx5_handle_changelowerstate()), Since these events might not come later, which can lead to both bond ports having tx_enabled=0 - which is not a valid state of LAG bond. Fixes: 9b412cc35f00 ("net/mlx5e: Add LAG warning if bond slave is not lag master") Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5e: CT, Fix multiple allocations and memleak of mod acts	Roi Dayan
	CT clear action offload adds additional mod hdr actions to the flow's original mod actions in order to clear the registers which hold ct_state. When such flow also includes encap action, a neigh update event can cause the driver to unoffload the flow and then reoffload it. Each time this happens, the ct clear handling adds that same set of mod hdr actions to reset ct_state until the max of mod hdr actions is reached. Also the driver never releases the allocated mod hdr actions and causing a memleak. Fix above two issues by moving CT clear mod acts allocation into the parsing actions phase and only use it when offloading the rule. The release of mod acts will be done in the normal flow_put(). backtrace: [<000000007316e2f3>] krealloc+0x83/0xd0 [<00000000ef157de1>] mlx5e_mod_hdr_alloc+0x147/0x300 [mlx5_core] [<00000000970ce4ae>] mlx5e_tc_match_to_reg_set_and_get_id+0xd7/0x240 [mlx5_core] [<0000000067c5fa17>] mlx5e_tc_match_to_reg_set+0xa/0x20 [mlx5_core] [<00000000d032eb98>] mlx5_tc_ct_entry_set_registers.isra.0+0x36/0xc0 [mlx5_core] [<00000000fd23b869>] mlx5_tc_ct_flow_offload+0x272/0x1f10 [mlx5_core] [<000000004fc24acc>] mlx5e_tc_offload_fdb_rules.part.0+0x150/0x620 [mlx5_core] [<00000000dc741c17>] mlx5e_tc_encap_flows_add+0x489/0x690 [mlx5_core] [<00000000e92e49d7>] mlx5e_rep_update_flows+0x6e4/0x9b0 [mlx5_core] [<00000000f60f5602>] mlx5e_rep_neigh_update+0x39a/0x5d0 [mlx5_core] Fixes: 1ef3018f5af3 ("net/mlx5e: CT: Support clear action") Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: Fix flow counters SF bulk query len	Avihai Horon
	When doing a flow counters bulk query, the number of counters to query must be aligned to 4. Current SF bulk query len is not aligned to 4, which leads to an error when trying to query more than 4 counters. Fix it by aligning SF bulk query len to 4. Fixes: 2fdeb4f4c2ae ("net/mlx5: Reduce flow counters bulk query buffer size for SFs") Signed-off-by: Avihai Horon <avihaih@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: E-Switch, rebuild lag only when needed	Mark Bloch
	A user can enable VFs without changing E-Switch mode, this can happen when a user moves straight to switchdev mode and only once in switchdev VFs are enabled via the sysfs interface. The cited commit assumed this isn't possible and exposed a single API function where the E-switch calls into the lag code, breaks the lag and prevents any other lag operations to take place until the E-switch update has ended. Breaking the hardware lag when it isn't needed can make it such that hardware lag can't be enabled again. In the sysfs call path check if the current E-Switch mode is NONE, in the context of the function it can only mean the E-Switch is moving out of NONE mode and the hardware lag should be disabled and enabled once the mode change has ended. If the mode isn't NONE it means VFs are about to be enabled and such operation doesn't require toggling the hardware lag. Fixes: cac1eb2cf2e3 ("net/mlx5: Lag, properly lock eswitch if needed") Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: Update error handler for UCTX and UMEM	Neta Ostrovsky
	In the fast unload flow, the device state is set to internal error, which indicates that the driver started the destroy process. In this case, when a destroy command is being executed, it should return MLX5_CMD_STAT_OK. Fix MLX5_CMD_OP_DESTROY_UCTX and MLX5_CMD_OP_DESTROY_UMEM to return OK instead of EIO. This fixes a call trace in the umem release process - [ 2633.536695] Call Trace: [ 2633.537518] ib_uverbs_remove_one+0xc3/0x140 [ib_uverbs] [ 2633.538596] remove_client_context+0x8b/0xd0 [ib_core] [ 2633.539641] disable_device+0x8c/0x130 [ib_core] [ 2633.540615] __ib_unregister_device+0x35/0xa0 [ib_core] [ 2633.541640] ib_unregister_device+0x21/0x30 [ib_core] [ 2633.542663] __mlx5_ib_remove+0x38/0x90 [mlx5_ib] [ 2633.543640] auxiliary_bus_remove+0x1e/0x30 [auxiliary] [ 2633.544661] device_release_driver_internal+0x103/0x1f0 [ 2633.545679] bus_remove_device+0xf7/0x170 [ 2633.546640] device_del+0x181/0x410 [ 2633.547606] mlx5_rescan_drivers_locked.part.10+0x63/0x160 [mlx5_core] [ 2633.548777] mlx5_unregister_device+0x27/0x40 [mlx5_core] [ 2633.549841] mlx5_uninit_one+0x21/0xc0 [mlx5_core] [ 2633.550864] remove_one+0x69/0xe0 [mlx5_core] [ 2633.551819] pci_device_remove+0x3b/0xc0 [ 2633.552731] device_release_driver_internal+0x103/0x1f0 [ 2633.553746] unbind_store+0xf6/0x130 [ 2633.554657] kernfs_fop_write+0x116/0x190 [ 2633.555567] vfs_write+0xa5/0x1a0 [ 2633.556407] ksys_write+0x4f/0xb0 [ 2633.557233] do_syscall_64+0x5b/0x1a0 [ 2633.558071] entry_SYSCALL_64_after_hwframe+0x65/0xca [ 2633.559018] RIP: 0033:0x7f9977132648 [ 2633.559821] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 55 6f 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55 [ 2633.562332] RSP: 002b:00007fffb1a83888 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 2633.563472] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f9977132648 [ 2633.564541] RDX: 000000000000000c RSI: 000055b90546e230 RDI: 0000000000000001 [ 2633.565596] RBP: 000055b90546e230 R08: 00007f9977406860 R09: 00007f9977a54740 [ 2633.566653] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f99774056e0 [ 2633.567692] R13: 000000000000000c R14: 00007f9977400880 R15: 000000000000000c [ 2633.568725] ---[ end trace 10b4fe52945e544d ]--- Fixes: 6a6fabbfa3e8 ("net/mlx5: Update pci error handler entries and command translation") Signed-off-by: Neta Ostrovsky <netao@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: DR, Fix check for unsupported fields in match param	Yevgeny Kliteynik
	The existing loop doesn't cast the buffer while scanning it, which results in out-of-bounds read and failure to create the matcher. Fixes: 941f19798a11 ("net/mlx5: DR, Add check for unsupported fields in match param") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: DR, Handle eswitch manager and uplink vports separately	Yevgeny Kliteynik
	When querying eswitch manager vport capabilities as "other = 1", we encounter a FW compatibility issue with older FW versions. To maintain backward compatibility, eswitch manager vport should be queried as "other = 0" vport both for ECPF and non-ECPF cases. This patch fixes these queries and improves the code readability by handling eswitch manager and uplink vports separately, avoiding the excessive 'if' conditions. Also, uplink caps are stored similar to esw manager and not as part of xarray. Fixes: dd4acb2a0954 ("net/mlx5: DR, Add missing query for vport 0") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5e: nullify cq->dbg pointer in mlx5_debug_cq_remove()	Valentine Fatiev
	Prior to this patch in case mlx5_core_destroy_cq() failed it proceeds to rest of destroy operations. mlx5_core_destroy_cq() could be called again by user and cause additional call of mlx5_debug_cq_remove(). cq->dbg was not nullify in previous call and cause the crash. Fix it by nullify cq->dbg pointer after removal. Also proceed to destroy operations only if FW return 0 for MLX5_CMD_OP_DESTROY_CQ command. general protection fault, probably for non-canonical address 0x2000300004058: 0000 [#1] SMP PTI CPU: 5 PID: 1228 Comm: python Not tainted 5.15.0-rc5_for_upstream_min_debug_2021_10_14_11_06 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:lockref_get+0x1/0x60 Code: 5d e9 53 ff ff ff 48 8d 7f 70 e8 0a 2e 48 00 c7 85 d0 00 00 00 02 00 00 00 c6 45 70 00 fb 5d c3 c3 cc cc cc cc cc cc cc cc 53 <48> 8b 17 48 89 fb 85 d2 75 3d 48 89 d0 bf 64 00 00 00 48 89 c1 48 RSP: 0018:ffff888137dd7a38 EFLAGS: 00010206 RAX: 0000000000000000 RBX: ffff888107d5f458 RCX: 00000000fffffffe RDX: 000000000002c2b0 RSI: ffffffff8155e2e0 RDI: 0002000300004058 RBP: ffff888137dd7a88 R08: 0002000300004058 R09: ffff8881144a9f88 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881141d4000 R13: ffff888137dd7c68 R14: ffff888137dd7d58 R15: ffff888137dd7cc0 FS: 00007f4644f2a4c0(0000) GS:ffff8887a2d40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055b4500f4380 CR3: 0000000114f7a003 CR4: 0000000000170ea0 Call Trace: simple_recursive_removal+0x33/0x2e0 ? debugfs_remove+0x60/0x60 debugfs_remove+0x40/0x60 mlx5_debug_cq_remove+0x32/0x70 [mlx5_core] mlx5_core_destroy_cq+0x41/0x1d0 [mlx5_core] devx_obj_cleanup+0x151/0x330 [mlx5_ib] ? __pollwait+0xd0/0xd0 ? xas_load+0x5/0x70 ? xa_load+0x62/0xa0 destroy_hw_idr_uobject+0x20/0x80 [ib_uverbs] uverbs_destroy_uobject+0x3b/0x360 [ib_uverbs] uobj_destroy+0x54/0xa0 [ib_uverbs] ib_uverbs_cmd_verbs+0xaf2/0x1160 [ib_uverbs] ? uverbs_finalize_object+0xd0/0xd0 [ib_uverbs] ib_uverbs_ioctl+0xc4/0x1b0 [ib_uverbs] __x64_sys_ioctl+0x3e4/0x8e0 Fixes: 94b960b9deff ("net/mlx5e: Fix memory leak in mlx5_core_destroy_cq() error path") Signed-off-by: Valentine Fatiev <valentinef@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-11-16	net/mlx5: E-Switch, Fix resetting of encap mode when entering switchdev	Paul Blakey
	E-Switch encap mode is relevant only when in switchdev mode. The RDMA driver can query the encap configuration via mlx5_eswitch_get_encap_mode(). Make sure it returns the currently used mode and not the set one. This reverts the cited commit which reset the encap mode on entering switchdev and fixes the original issue properly. Fixes: 9a64144d683a ("net/mlx5: E-Switch, Fix default encap mode") Signed-off-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>