summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-11-04i40e: fix race condition by adding filter's intermediate sync stateAleksandr Loktionov
Fix a race condition in the i40e driver that leads to MAC/VLAN filters becoming corrupted and leaking. Address the issue that occurs under heavy load when multiple threads are concurrently modifying MAC/VLAN filters by setting mac and port VLAN. 1. Thread T0 allocates a filter in i40e_add_filter() within i40e_ndo_set_vf_port_vlan(). 2. Thread T1 concurrently frees the filter in __i40e_del_filter() within i40e_ndo_set_vf_mac(). 3. Subsequently, i40e_service_task() calls i40e_sync_vsi_filters(), which refers to the already freed filter memory, causing corruption. Reproduction steps: 1. Spawn multiple VFs. 2. Apply a concurrent heavy load by running parallel operations to change MAC addresses on the VFs and change port VLANs on the host. 3. Observe errors in dmesg: "Error I40E_AQ_RC_ENOSPC adding RX filters on VF XX, please set promiscuous on manually for VF XX". Exact code for stable reproduction Intel can't open-source now. The fix involves implementing a new intermediate filter state, I40E_FILTER_NEW_SYNC, for the time when a filter is on a tmp_add_list. These filters cannot be deleted from the hash list directly but must be removed using the full process. Fixes: 278e7d0b9d68 ("i40e: store MAC/VLAN filters in a hash with the MAC Address as key") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Reviewed-by: Michal Schmidt <mschmidt@redhat.com> Tested-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-11-04idpf: fix idpf_vc_core_init error pathPavan Kumar Linga
In an event where the platform running the device control plane is rebooted, reset is detected on the driver. It releases all the resources and waits for the reset to complete. Once the reset is done, it tries to build the resources back. At this time if the device control plane is not yet started, then the driver timeouts on the virtchnl message and retries to establish the mailbox again. In the retry flow, mailbox is deinitialized but the mailbox workqueue is still alive and polling for the mailbox message. This results in accessing the released control queue leading to null-ptr-deref. Fix it by unrolling the work queue cancellation and mailbox deinitialization in the reverse order which they got initialized. Fixes: 4930fbf419a7 ("idpf: add core init and interrupt request") Fixes: 34c21fa894a1 ("idpf: implement virtchnl transaction manager") Cc: stable@vger.kernel.org # 6.9+ Reviewed-by: Tarun K Singh <tarun.k.singh@intel.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-11-04idpf: avoid vport access in idpf_get_link_ksettingsPavan Kumar Linga
When the device control plane is removed or the platform running device control plane is rebooted, a reset is detected on the driver. On driver reset, it releases the resources and waits for the reset to complete. If the reset fails, it takes the error path and releases the vport lock. At this time if the monitoring tools tries to access link settings, it call traces for accessing released vport pointer. To avoid it, move link_speed_mbps to netdev_priv structure which removes the dependency on vport pointer and the vport lock in idpf_get_link_ksettings. Also use netif_carrier_ok() to check the link status and adjust the offsetof to use link_up instead of link_speed_mbps. Fixes: 02cbfba1add5 ("idpf: add ethtool callbacks") Cc: stable@vger.kernel.org # 6.7+ Reviewed-by: Tarun K Singh <tarun.k.singh@intel.com> Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-11-04ice: change q_index variable type to s16 to store -1 valueMateusz Polchlopek
Fix Flow Director not allowing to re-map traffic to 0th queue when action is configured to drop (and vice versa). The current implementation of ethtool callback in the ice driver forbids change Flow Director action from 0 to -1 and from -1 to 0 with an error, e.g: # ethtool -U eth2 flow-type tcp4 src-ip 1.1.1.1 loc 1 action 0 # ethtool -U eth2 flow-type tcp4 src-ip 1.1.1.1 loc 1 action -1 rmgr: Cannot insert RX class rule: Invalid argument We set the value of `u16 q_index = 0` at the beginning of the function ice_set_fdir_input_set(). In case of "drop traffic" action (which is equal to -1 in ethtool) we store the 0 value. Later, when want to change traffic rule to redirect to queue with index 0 it returns an error caused by duplicate found. Fix this behaviour by change of the type of field `q_index` from u16 to s16 in `struct ice_fdir_fltr`. This allows to store -1 in the field in case of "drop traffic" action. What is more, change the variable type in the function ice_set_fdir_input_set() and assign at the beginning the new `#define ICE_FDIR_NO_QUEUE_IDX` which is -1. Later, if the action is set to another value (point specific queue index) the variable value is overwritten in the function. Fixes: cac2a27cd9ab ("ice: Support IPv4 Flow Director filters") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-11-04ice: Fix use after free during unload with ports in bridgeMarcin Szycik
Unloading the ice driver while switchdev port representors are added to a bridge can lead to kernel panic. Reproducer: modprobe ice devlink dev eswitch set $PF1_PCI mode switchdev ip link add $BR type bridge ip link set $BR up echo 2 > /sys/class/net/$PF1/device/sriov_numvfs sleep 2 ip link set $PF1 master $BR ip link set $VF1_PR master $BR ip link set $VF2_PR master $BR ip link set $PF1 up ip link set $VF1_PR up ip link set $VF2_PR up ip link set $VF1 up rmmod irdma ice When unloading the driver, ice_eswitch_detach() is eventually called as part of VF freeing. First, it removes a port representor from xarray, then unregister_netdev() is called (via repr->ops.rem()), finally representor is deallocated. The problem comes from the bridge doing its own deinit at the same time. unregister_netdev() triggers a notifier chain, resulting in ice_eswitch_br_port_deinit() being called. It should set repr->br_port = NULL, but this does not happen since repr has already been removed from xarray and is not found. Regardless, it finishes up deallocating br_port. At this point, repr is still not freed and an fdb event can happen, in which ice_eswitch_br_fdb_event_work() takes repr->br_port and tries to use it, which causes a panic (use after free). Note that this only happens with 2 or more port representors added to the bridge, since with only one representor port, the bridge deinit is slightly different (ice_eswitch_br_port_deinit() is called via ice_eswitch_br_ports_flush(), not ice_eswitch_br_port_unlink()). Trace: Oops: general protection fault, probably for non-canonical address 0xf129010fd1a93284: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: maybe wild-memory-access in range [0x8948287e8d499420-0x8948287e8d499427] (...) Workqueue: ice_bridge_wq ice_eswitch_br_fdb_event_work [ice] RIP: 0010:__rht_bucket_nested+0xb4/0x180 (...) Call Trace: (...) ice_eswitch_br_fdb_find+0x3fa/0x550 [ice] ? __pfx_ice_eswitch_br_fdb_find+0x10/0x10 [ice] ice_eswitch_br_fdb_event_work+0x2de/0x1e60 [ice] ? __schedule+0xf60/0x5210 ? mutex_lock+0x91/0xe0 ? __pfx_ice_eswitch_br_fdb_event_work+0x10/0x10 [ice] ? ice_eswitch_br_update_work+0x1f4/0x310 [ice] (...) A workaround is available: brctl setageing $BR 0, which stops the bridge from adding fdb entries altogether. Change the order of operations in ice_eswitch_detach(): move the call to unregister_netdev() before removing repr from xarray. This way repr->br_port will be correctly set to NULL in ice_eswitch_br_port_deinit(), preventing a panic. Fixes: fff292b47ac1 ("ice: add VF representors one by one") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-11-03net: xilinx: axienet: Enqueue Tx packets in dql before dmaengine startsSuraj Gupta
Enqueue packets in dql after dma engine starts causes race condition. Tx transfer starts once dma engine is started and may execute dql dequeue in completion before it gets queued. It results in following kernel crash while running iperf stress test: kernel BUG at lib/dynamic_queue_limits.c:99! <snip> Internal error: Oops - BUG: 00000000f2000800 [#1] SMP pc : dql_completed+0x238/0x248 lr : dql_completed+0x3c/0x248 Call trace: dql_completed+0x238/0x248 axienet_dma_tx_cb+0xa0/0x170 xilinx_dma_do_tasklet+0xdc/0x290 tasklet_action_common+0xf8/0x11c tasklet_action+0x30/0x3c handle_softirqs+0xf8/0x230 <snip> Start dmaengine after enqueue in dql fixes the crash. Fixes: 6a91b846af85 ("net: axienet: Introduce dmaengine support") Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com> Link: https://patch.msgid.link/20241030062533.2527042-2-suraj.gupta2@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03MAINTAINERS: Remove self from DSA entryFlorian Fainelli
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/20241031173332.3858162-1-f.fainelli@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03net: enetc: allocate vf_state during PF probesWei Fang
In the previous implementation, vf_state is allocated memory only when VF is enabled. However, net_device_ops::ndo_set_vf_mac() may be called before VF is enabled to configure the MAC address of VF. If this is the case, enetc_pf_set_vf_mac() will access vf_state, resulting in access to a null pointer. The simplified error log is as follows. root@ls1028ardb:~# ip link set eno0 vf 1 mac 00:0c:e7:66:77:89 [ 173.543315] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000004 [ 173.637254] pc : enetc_pf_set_vf_mac+0x3c/0x80 Message from sy [ 173.641973] lr : do_setlink+0x4a8/0xec8 [ 173.732292] Call trace: [ 173.734740] enetc_pf_set_vf_mac+0x3c/0x80 [ 173.738847] __rtnl_newlink+0x530/0x89c [ 173.742692] rtnl_newlink+0x50/0x7c [ 173.746189] rtnetlink_rcv_msg+0x128/0x390 [ 173.750298] netlink_rcv_skb+0x60/0x130 [ 173.754145] rtnetlink_rcv+0x18/0x24 [ 173.757731] netlink_unicast+0x318/0x380 [ 173.761665] netlink_sendmsg+0x17c/0x3c8 Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241031060247.1290941-2-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03sctp: properly validate chunk size in sctp_sf_ootb()Xin Long
A size validation fix similar to that in Commit 50619dbf8db7 ("sctp: add size validation when walking chunks") is also required in sctp_sf_ootb() to address a crash reported by syzbot: BUG: KMSAN: uninit-value in sctp_sf_ootb+0x7f5/0xce0 net/sctp/sm_statefuns.c:3712 sctp_sf_ootb+0x7f5/0xce0 net/sctp/sm_statefuns.c:3712 sctp_do_sm+0x181/0x93d0 net/sctp/sm_sideeffect.c:1166 sctp_endpoint_bh_rcv+0xc38/0xf90 net/sctp/endpointola.c:407 sctp_inq_push+0x2ef/0x380 net/sctp/inqueue.c:88 sctp_rcv+0x3831/0x3b20 net/sctp/input.c:243 sctp4_rcv+0x42/0x50 net/sctp/protocol.c:1159 ip_protocol_deliver_rcu+0xb51/0x13d0 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x336/0x500 net/ipv4/ip_input.c:233 Reported-by: syzbot+f0cbb34d39392f2746ca@syzkaller.appspotmail.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/a29ebb6d8b9f8affd0f9abb296faafafe10c17d8.1730223981.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03net: wwan: t7xx: Fix off-by-one error in t7xx_dpmaif_rx_buf_alloc()Jinjie Ruan
The error path in t7xx_dpmaif_rx_buf_alloc(), free and unmap the already allocated and mapped skb in a loop, but the loop condition terminates when the index reaches zero, which fails to free the first allocated skb at index zero. Check with i-- so that skb at index 0 is freed as well. Cc: stable@vger.kernel.org Fixes: d642b012df70 ("net: wwan: t7xx: Add data path interface") Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://patch.msgid.link/20241101025316.3234023-1-ruanjinjie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-01dt-bindings: net: xlnx,axi-ethernet: Correct phy-mode property valueSuraj Gupta
Correct phy-mode property value to 1000base-x. Fixes: cbb1ca6d5f9a ("dt-bindings: net: xlnx,axi-ethernet: convert bindings document to yaml") Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20241028091214.2078726-1-suraj.gupta2@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-31net: dpaa_eth: print FD status in CPU endianness in dpaa_eth_fd tracepointVladimir Oltean
Sparse warns: note: in included file (through ../include/trace/trace_events.h, ../include/trace/define_trace.h, ../drivers/net/ethernet/freescale/dpaa/dpaa_eth_trace.h): warning: incorrect type in assignment (different base types) expected unsigned int [usertype] fd_status got restricted __be32 const [usertype] status We take struct qm_fd :: status, store it and print it as an u32, though it is a big endian field. We should print the FD status in CPU endianness for ease of debug and consistency between PowerPC and Arm systems. Though it is a not often used debug feature, it is best to treat it as a bug and backport the format change to all supported stable kernels, for consistency. Fixes: eb11ddf36eb8 ("dpaa_eth: add trace points") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Link: https://patch.msgid.link/20241029163105.44135-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-31net: enetc: set MAC address to the VF net_deviceWei Fang
The MAC address of VF can be configured through the mailbox mechanism of ENETC, but the previous implementation forgot to set the MAC address in net_device, resulting in the SMAC of the sent frames still being the old MAC address. Since the MAC address in the hardware has been changed, Rx cannot receive frames with the DMAC address as the new MAC address. The most obvious phenomenon is that after changing the MAC address, we can see that the MAC address of eno0vf0 has not changed through the "ifconfig eno0vf0" command and the IP address cannot be obtained . root@ls1028ardb:~# ifconfig eno0vf0 down root@ls1028ardb:~# ifconfig eno0vf0 hw ether 00:04:9f:3a:4d:56 up root@ls1028ardb:~# ifconfig eno0vf0 eno0vf0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 ether 66:36:2c:3b:87:76 txqueuelen 1000 (Ethernet) RX packets 794 bytes 69239 (69.2 KB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 11 bytes 2226 (2.2 KB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0 Fixes: beb74ac878c8 ("enetc: Add vf to pf messaging support") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://patch.msgid.link/20241029090406.841836-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-31Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds
Pull bpf fixes from Daniel Borkmann: - Fix BPF verifier to force a checkpoint when the program's jump history becomes too long (Eduard Zingerman) - Add several fixes to the BPF bits iterator addressing issues like memory leaks and overflow problems (Hou Tao) - Fix an out-of-bounds write in trie_get_next_key (Byeonguk Jeong) - Fix BPF test infra's LIVE_FRAME frame update after a page has been recycled (Toke Høiland-Jørgensen) - Fix BPF verifier and undo the 40-bytes extra stack space for bpf_fastcall patterns due to various bugs (Eduard Zingerman) - Fix a BPF sockmap race condition which could trigger a NULL pointer dereference in sock_map_link_update_prog (Cong Wang) - Fix tcp_bpf_recvmsg_parser to retrieve seq_copied from tcp_sk under the socket lock (Jiayuan Chen) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf, test_run: Fix LIVE_FRAME frame update after a page has been recycled selftests/bpf: Add three test cases for bits_iter bpf: Use __u64 to save the bits in bits iterator bpf: Check the validity of nr_words in bpf_iter_bits_new() bpf: Add bpf_mem_alloc_check_size() helper bpf: Free dynamically allocated bits in bpf_iter_bits_destroy() bpf: disallow 40-bytes extra stack for bpf_fastcall patterns selftests/bpf: Add test for trie_get_next_key() bpf: Fix out-of-bounds write in trie_get_next_key() selftests/bpf: Test with a very short loop bpf: Force checkpoint when jmp history is too long bpf: fix filed access without lock sock_map: fix a NULL pointer dereference in sock_map_link_update_prog()
2024-10-31Merge tag 'net-6.12-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from WiFi, bluetooth and netfilter. No known new regressions outstanding. Current release - regressions: - wifi: mt76: do not increase mcu skb refcount if retry is not supported Current release - new code bugs: - wifi: - rtw88: fix the RX aggregation in USB 3 mode - mac80211: fix memory corruption bug in struct ieee80211_chanctx Previous releases - regressions: - sched: - stop qdisc_tree_reduce_backlog on TC_H_ROOT - sch_api: fix xa_insert() error path in tcf_block_get_ext() - wifi: - revert "wifi: iwlwifi: remove retry loops in start" - cfg80211: clear wdev->cqm_config pointer on free - netfilter: fix potential crash in nf_send_reset6() - ip_tunnel: fix suspicious RCU usage warning in ip_tunnel_find() - bluetooth: fix null-ptr-deref in hci_read_supported_codecs - eth: mlxsw: add missing verification before pushing Tx header - eth: hns3: fixed hclge_fetch_pf_reg accesses bar space out of bounds issue Previous releases - always broken: - wifi: mac80211: do not pass a stopped vif to the driver in .get_txpower - netfilter: sanitize offset and length before calling skb_checksum() - core: - fix crash when config small gso_max_size/gso_ipv4_max_size - skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension - mptcp: protect sched with rcu_read_lock - eth: ice: fix crash on probe for DPLL enabled E810 LOM - eth: macsec: fix use-after-free while sending the offloading packet - eth: stmmac: fix unbalanced DMA map/unmap for non-paged SKB data - eth: hns3: fix kernel crash when 1588 is sent on HIP08 devices - eth: mtk_wed: fix path of MT7988 WO firmware" * tag 'net-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (70 commits) net: hns3: fix kernel crash when 1588 is sent on HIP08 devices net: hns3: fixed hclge_fetch_pf_reg accesses bar space out of bounds issue net: hns3: initialize reset_timer before hclgevf_misc_irq_init() net: hns3: don't auto enable misc vector net: hns3: Resolved the issue that the debugfs query result is inconsistent. net: hns3: fix missing features due to dev->features configuration too early net: hns3: fixed reset failure issues caused by the incorrect reset type net: hns3: add sync command to sync io-pgtable net: hns3: default enable tx bounce buffer when smmu enabled netfilter: nft_payload: sanitize offset and length before calling skb_checksum() net: ethernet: mtk_wed: fix path of MT7988 WO firmware selftests: forwarding: Add IPv6 GRE remote change tests mlxsw: spectrum_ipip: Fix memory leak when changing remote IPv6 address mlxsw: pci: Sync Rx buffers for device mlxsw: pci: Sync Rx buffers for CPU mlxsw: spectrum_ptp: Add missing verification before pushing Tx header net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extension Bluetooth: hci: fix null-ptr-deref in hci_read_supported_codecs netfilter: nf_reject_ipv6: fix potential crash in nf_send_reset6() netfilter: Fix use-after-free in get_info() ...
2024-10-31Merge tag 'sound-6.12-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Here we see slightly more commits than wished, but basically all are small and mostly trivial fixes. The only core change is the workaround for __counted_by() usage in ASoC DAPM code, while the rest are device-specific fixes for Intel Baytrail devices, Cirrus and wcd937x codecs, and HD-audio / USB-audio devices" * tag 'sound-6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek: Fix headset mic on TUXEDO Stellaris 16 Gen6 mb1 ALSA: hda/realtek: Fix headset mic on TUXEDO Gemini 17 Gen3 ALSA: usb-audio: Add quirks for Dell WD19 dock ASoC: codecs: wcd937x: relax the AUX PDM watchdog ASoC: codecs: wcd937x: add missing LO Switch control ASoC: dt-bindings: rockchip,rk3308-codec: add port property ALSA: hda/realtek: Add subwoofer quirk for Infinix ZERO BOOK 13 ASoC: dapm: fix bounds checker error in dapm_widget_list_create ASoC: Intel: sst: Fix used of uninitialized ctx to log an error ASoC: cs42l51: Fix some error handling paths in cs42l51_probe() ASoC: Intel: sst: Support LPE0F28 ACPI HID ALSA: hda/realtek: Limit internal Mic boost on Dell platform ASoC: Intel: bytcr_rt5640: Add DMI quirk for Vexia Edu Atla 10 tablet ASoC: Intel: bytcr_rt5640: Add support for non ACPI instantiated codec ASoC: codecs: rt5640: Always disable IRQs from rt5640_cancel_work()
2024-10-31bpf, test_run: Fix LIVE_FRAME frame update after a page has been recycledToke Høiland-Jørgensen
The test_run code detects whether a page has been modified and re-initialises the xdp_frame structure if it has, using xdp_update_frame_from_buff(). However, xdp_update_frame_from_buff() doesn't touch frame->mem, so that wasn't correctly re-initialised, which led to the pages from page_pool not being returned correctly. Syzbot noticed this as a memory leak. Fix this by also copying the frame->mem structure when re-initialising the frame, like we do on initialisation of a new page from page_pool. Fixes: e5995bc7e2ba ("bpf, test_run: fix crashes due to XDP frame overwriting/corruption") Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN") Reported-by: syzbot+d121e098da06af416d23@syzkaller.appspotmail.com Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: syzbot+d121e098da06af416d23@syzkaller.appspotmail.com Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://lore.kernel.org/bpf/20241030-test-run-mem-fix-v1-1-41e88e8cae43@redhat.com
2024-10-31Merge tag 'nf-24-10-31' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== The following patchset contains Netfilter fixes for net: 1) Remove unused parameters in conntrack_dump_flush.c used by selftests, from Liu Jing. 2) Fix possible UaF when removing xtables module via getsockopt() interface, from Dong Chenchen. 3) Fix potential crash in nf_send_reset6() reported by syzkaller. From Eric Dumazet 4) Validate offset and length before calling skb_checksum() in nft_payload, otherwise hitting BUG() is possible. netfilter pull request 24-10-31 * tag 'nf-24-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_payload: sanitize offset and length before calling skb_checksum() netfilter: nf_reject_ipv6: fix potential crash in nf_send_reset6() netfilter: Fix use-after-free in get_info() selftests: netfilter: remove unused parameter ==================== Link: https://patch.msgid.link/ Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-31Merge tag 'for-net-2024-10-30' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - hci: fix null-ptr-deref in hci_read_supported_codecs * tag 'for-net-2024-10-30' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: hci: fix null-ptr-deref in hci_read_supported_codecs ==================== Link: https://patch.msgid.link/20241030192205.38298-1-luiz.dentz@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-31Merge branch 'there-are-some-bugfix-for-the-hns3-ethernet-driver'Paolo Abeni
Jijie Shao says: ==================== There are some bugfix for the HNS3 ethernet driver ChangeLog: v2 -> v3: - Rewrite the commit logs of net: hns3: add sync command to sync io-pgtable' to add more verbose explanation, suggested Paolo. - Add fixes tag for hardware issue, suggested Paolo and Simon Horman. v2: https://lore.kernel.org/all/20241018101059.1718375-1-shaojijie@huawei.com/ v1 -> v2: - Pass IRQF_NO_AUTOEN to request_irq(), suggested by Jakub. - Rewrite the commit logs of 'net: hns3: default enable tx bounce buffer when smmu enabled' and 'net: hns3: add sync command to sync io-pgtable'. v1: https://lore.kernel.org/all/20241011094521.3008298-1-shaojijie@huawei.com/ ==================== Link: https://patch.msgid.link/20241025092938.2912958-1-shaojijie@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-31net: hns3: fix kernel crash when 1588 is sent on HIP08 devicesJie Wang
Currently, HIP08 devices does not register the ptp devices, so the hdev->ptp is NULL. But the tx process would still try to set hardware time stamp info with SKBTX_HW_TSTAMP flag and cause a kernel crash. [ 128.087798] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018 ... [ 128.280251] pc : hclge_ptp_set_tx_info+0x2c/0x140 [hclge] [ 128.286600] lr : hclge_ptp_set_tx_info+0x20/0x140 [hclge] [ 128.292938] sp : ffff800059b93140 [ 128.297200] x29: ffff800059b93140 x28: 0000000000003280 [ 128.303455] x27: ffff800020d48280 x26: ffff0cb9dc814080 [ 128.309715] x25: ffff0cb9cde93fa0 x24: 0000000000000001 [ 128.315969] x23: 0000000000000000 x22: 0000000000000194 [ 128.322219] x21: ffff0cd94f986000 x20: 0000000000000000 [ 128.328462] x19: ffff0cb9d2a166c0 x18: 0000000000000000 [ 128.334698] x17: 0000000000000000 x16: ffffcf1fc523ed24 [ 128.340934] x15: 0000ffffd530a518 x14: 0000000000000000 [ 128.347162] x13: ffff0cd6bdb31310 x12: 0000000000000368 [ 128.353388] x11: ffff0cb9cfbc7070 x10: ffff2cf55dd11e02 [ 128.359606] x9 : ffffcf1f85a212b4 x8 : ffff0cd7cf27dab0 [ 128.365831] x7 : 0000000000000a20 x6 : ffff0cd7cf27d000 [ 128.372040] x5 : 0000000000000000 x4 : 000000000000ffff [ 128.378243] x3 : 0000000000000400 x2 : ffffcf1f85a21294 [ 128.384437] x1 : ffff0cb9db520080 x0 : ffff0cb9db500080 [ 128.390626] Call trace: [ 128.393964] hclge_ptp_set_tx_info+0x2c/0x140 [hclge] [ 128.399893] hns3_nic_net_xmit+0x39c/0x4c4 [hns3] [ 128.405468] xmit_one.constprop.0+0xc4/0x200 [ 128.410600] dev_hard_start_xmit+0x54/0xf0 [ 128.415556] sch_direct_xmit+0xe8/0x634 [ 128.420246] __dev_queue_xmit+0x224/0xc70 [ 128.425101] dev_queue_xmit+0x1c/0x40 [ 128.429608] ovs_vport_send+0xac/0x1a0 [openvswitch] [ 128.435409] do_output+0x60/0x17c [openvswitch] [ 128.440770] do_execute_actions+0x898/0x8c4 [openvswitch] [ 128.446993] ovs_execute_actions+0x64/0xf0 [openvswitch] [ 128.453129] ovs_dp_process_packet+0xa0/0x224 [openvswitch] [ 128.459530] ovs_vport_receive+0x7c/0xfc [openvswitch] [ 128.465497] internal_dev_xmit+0x34/0xb0 [openvswitch] [ 128.471460] xmit_one.constprop.0+0xc4/0x200 [ 128.476561] dev_hard_start_xmit+0x54/0xf0 [ 128.481489] __dev_queue_xmit+0x968/0xc70 [ 128.486330] dev_queue_xmit+0x1c/0x40 [ 128.490856] ip_finish_output2+0x250/0x570 [ 128.495810] __ip_finish_output+0x170/0x1e0 [ 128.500832] ip_finish_output+0x3c/0xf0 [ 128.505504] ip_output+0xbc/0x160 [ 128.509654] ip_send_skb+0x58/0xd4 [ 128.513892] udp_send_skb+0x12c/0x354 [ 128.518387] udp_sendmsg+0x7a8/0x9c0 [ 128.522793] inet_sendmsg+0x4c/0x8c [ 128.527116] __sock_sendmsg+0x48/0x80 [ 128.531609] __sys_sendto+0x124/0x164 [ 128.536099] __arm64_sys_sendto+0x30/0x5c [ 128.540935] invoke_syscall+0x50/0x130 [ 128.545508] el0_svc_common.constprop.0+0x10c/0x124 [ 128.551205] do_el0_svc+0x34/0xdc [ 128.555347] el0_svc+0x20/0x30 [ 128.559227] el0_sync_handler+0xb8/0xc0 [ 128.563883] el0_sync+0x160/0x180 Fixes: 0bf5eb788512 ("net: hns3: add support for PTP") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-31net: hns3: fixed hclge_fetch_pf_reg accesses bar space out of bounds issueHao Lan
The TQP BAR space is divided into two segments. TQPs 0-1023 and TQPs 1024-1279 are in different BAR space addresses. However, hclge_fetch_pf_reg does not distinguish the tqp space information when reading the tqp space information. When the number of TQPs is greater than 1024, access bar space overwriting occurs. The problem of different segments has been considered during the initialization of tqp.io_base. Therefore, tqp.io_base is directly used when the queue is read in hclge_fetch_pf_reg. The error message: Unable to handle kernel paging request at virtual address ffff800037200000 pc : hclge_fetch_pf_reg+0x138/0x250 [hclge] lr : hclge_get_regs+0x84/0x1d0 [hclge] Call trace: hclge_fetch_pf_reg+0x138/0x250 [hclge] hclge_get_regs+0x84/0x1d0 [hclge] hns3_get_regs+0x2c/0x50 [hns3] ethtool_get_regs+0xf4/0x270 dev_ethtool+0x674/0x8a0 dev_ioctl+0x270/0x36c sock_do_ioctl+0x110/0x2a0 sock_ioctl+0x2ac/0x530 __arm64_sys_ioctl+0xa8/0x100 invoke_syscall+0x4c/0x124 el0_svc_common.constprop.0+0x140/0x15c do_el0_svc+0x30/0xd0 el0_svc+0x1c/0x2c el0_sync_handler+0xb0/0xb4 el0_sync+0x168/0x180 Fixes: 939ccd107ffc ("net: hns3: move dump regs function to a separate file") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-31net: hns3: initialize reset_timer before hclgevf_misc_irq_init()Jian Shen
Currently the misc irq is initialized before reset_timer setup. But it will access the reset_timer in the irq handler. So initialize the reset_timer earlier. Fixes: ff200099d271 ("net: hns3: remove unnecessary work in hclgevf_main") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-31net: hns3: don't auto enable misc vectorJian Shen
Currently, there is a time window between misc irq enabled and service task inited. If an interrupte is reported at this time, it will cause warning like below: [ 16.324639] Call trace: [ 16.324641] __queue_delayed_work+0xb8/0xe0 [ 16.324643] mod_delayed_work_on+0x78/0xd0 [ 16.324655] hclge_errhand_task_schedule+0x58/0x90 [hclge] [ 16.324662] hclge_misc_irq_handle+0x168/0x240 [hclge] [ 16.324666] __handle_irq_event_percpu+0x64/0x1e0 [ 16.324667] handle_irq_event+0x80/0x170 [ 16.324670] handle_fasteoi_edge_irq+0x110/0x2bc [ 16.324671] __handle_domain_irq+0x84/0xfc [ 16.324673] gic_handle_irq+0x88/0x2c0 [ 16.324674] el1_irq+0xb8/0x140 [ 16.324677] arch_cpu_idle+0x18/0x40 [ 16.324679] default_idle_call+0x5c/0x1bc [ 16.324682] cpuidle_idle_call+0x18c/0x1c4 [ 16.324684] do_idle+0x174/0x17c [ 16.324685] cpu_startup_entry+0x30/0x6c [ 16.324687] secondary_start_kernel+0x1a4/0x280 [ 16.324688] ---[ end trace 6aa0bff672a964aa ]--- So don't auto enable misc vector when request irq.. Fixes: 7be1b9f3e99f ("net: hns3: make hclge_service use delayed workqueue") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-31net: hns3: Resolved the issue that the debugfs query result is inconsistent.Hao Lan
This patch modifies the implementation of debugfs: When the user process stops unexpectedly, not all data of the file system is read. In this case, the save_buf pointer is not released. When the user process is called next time, save_buf is used to copy the cached data to the user space. As a result, the queried data is inconsistent. To solve this problem, determine whether the function is invoked for the first time based on the value of *ppos. If *ppos is 0, obtain the actual data. Fixes: 5e69ea7ee2a6 ("net: hns3: refactor the debugfs process") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Guangwei Zhang <zhangwangwei6@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-31net: hns3: fix missing features due to dev->features configuration too earlyHao Lan
Currently, the netdev->features is configured in hns3_nic_set_features. As a result, __netdev_update_features considers that there is no feature difference, and the procedures of the real features are missing. Fixes: 2a7556bb2b73 ("net: hns3: implement ndo_features_check ops for hns3 driver") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-31net: hns3: fixed reset failure issues caused by the incorrect reset typeHao Lan
When a reset type that is not supported by the driver is input, a reset pending flag bit of the HNAE3_NONE_RESET type is generated in reset_pending. The driver does not have a mechanism to clear this type of error. As a result, the driver considers that the reset is not complete. This patch provides a mechanism to clear the HNAE3_NONE_RESET flag and the parameter of hnae3_ae_ops.set_default_reset_request is verified. The error message: hns3 0000:39:01.0: cmd failed -16 hns3 0000:39:01.0: hclge device re-init failed, VF is disabled! hns3 0000:39:01.0: failed to reset VF stack hns3 0000:39:01.0: failed to reset VF(4) hns3 0000:39:01.0: prepare reset(2) wait done hns3 0000:39:01.0 eth4: already uninitialized Use the crash tool to view struct hclgevf_dev: struct hclgevf_dev { ... default_reset_request = 0x20, reset_level = HNAE3_NONE_RESET, reset_pending = 0x100, reset_type = HNAE3_NONE_RESET, ... }; Fixes: 720bd5837e37 ("net: hns3: add set_default_reset_request in the hnae3_ae_ops") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-31net: hns3: add sync command to sync io-pgtableJian Shen
To avoid errors in pgtable prefectch, add a sync command to sync io-pagtable. This is a supplement for the previous patch. We want all the tx packet can be handled with tx bounce buffer path. But it depends on the remain space of the spare buffer, checked by the hns3_can_use_tx_bounce(). In most cases, maybe 99.99%, it returns true. But once it return false by no available space, the packet will be handled with the former path, which will map/unmap the skb buffer. Then the driver will face the smmu prefetch risk again. So add a sync command in this case to avoid smmu prefectch, just protects corner scenes. Fixes: 295ba232a8c3 ("net: hns3: add device version to replace pci revision") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-31net: hns3: default enable tx bounce buffer when smmu enabledPeiyang Wang
The SMMU engine on HIP09 chip has a hardware issue. SMMU pagetable prefetch features may prefetch and use a invalid PTE even the PTE is valid at that time. This will cause the device trigger fake pagefaults. The solution is to avoid prefetching by adding a SYNC command when smmu mapping a iova. But the performance of nic has a sharp drop. Then we do this workaround, always enable tx bounce buffer, avoid mapping/unmapping on TX path. This issue only affects HNS3, so we always enable tx bounce buffer when smmu enabled to improve performance. Fixes: 295ba232a8c3 ("net: hns3: add device version to replace pci revision") Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com> Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-31netfilter: nft_payload: sanitize offset and length before calling skb_checksum()Pablo Neira Ayuso
If access to offset + length is larger than the skbuff length, then skb_checksum() triggers BUG_ON(). skb_checksum() internally subtracts the length parameter while iterating over skbuff, BUG_ON(len) at the end of it checks that the expected length to be included in the checksum calculation is fully consumed. Fixes: 7ec3f7b47b8d ("netfilter: nft_payload: add packet mangling support") Reported-by: Slavin Liu <slavin-ayu@qq.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-10-30net: ethernet: mtk_wed: fix path of MT7988 WO firmwareDaniel Golle
linux-firmware commit 808cba84 ("mtk_wed: add firmware for mt7988 Wireless Ethernet Dispatcher") added mt7988_wo_{0,1}.bin in the 'mediatek/mt7988' directory while driver current expects the files in the 'mediatek' directory. Change path in the driver header now that the firmware has been added. Fixes: e2f64db13aa1 ("net: ethernet: mtk_wed: introduce WED support for MT7988") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patch.msgid.link/Zxz0GWTR5X5LdWPe@pidgin.makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30Merge branch 'mlxsw-fixes'Jakub Kicinski
Petr Machata says: ==================== mlxsw: Fixes In this patchset: - Tx header should be pushed for each packet which is transmitted via Spectrum ASICs. Patch #1 adds a missing call to skb_cow_head() to make sure that there is both enough room to push the Tx header and that the SKB header is not cloned and can be modified. - Commit b5b60bb491b2 ("mlxsw: pci: Use page pool for Rx buffers allocation") converted mlxsw to use page pool for Rx buffers allocation. Sync for CPU and for device should be done for Rx pages. In patches #2 and #3, add the missing calls to sync pages for, respectively, CPU and the device. - Patch #4 then fixes a bug to IPv6 GRE forwarding offload. Patch #5 adds a generic forwarding test that fails with mlxsw ports prior to the fix. ==================== Link: https://patch.msgid.link/cover.1729866134.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30selftests: forwarding: Add IPv6 GRE remote change testsIdo Schimmel
Test that after changing the remote address of an ip6gre net device traffic is forwarded as expected. Test with both flat and hierarchical topologies and with and without an input / output keys. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/02b05246d2cdada0cf2fccffc0faa8a424d0f51b.1729866134.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30mlxsw: spectrum_ipip: Fix memory leak when changing remote IPv6 addressIdo Schimmel
The device stores IPv6 addresses that are used for encapsulation in linear memory that is managed by the driver. Changing the remote address of an ip6gre net device never worked properly, but since cited commit the following reproducer [1] would result in a warning [2] and a memory leak [3]. The problem is that the new remote address is never added by the driver to its hash table (and therefore the device) and the old address is never removed from it. Fix by programming the new address when the configuration of the ip6gre net device changes and removing the old one. If the address did not change, then the above would result in increasing the reference count of the address and then decreasing it. [1] # ip link add name bla up type ip6gre local 2001:db8:1::1 remote 2001:db8:2::1 tos inherit ttl inherit # ip link set dev bla type ip6gre remote 2001:db8:3::1 # ip link del dev bla # devlink dev reload pci/0000:01:00.0 [2] WARNING: CPU: 0 PID: 1682 at drivers/net/ethernet/mellanox/mlxsw/spectrum.c:3002 mlxsw_sp_ipv6_addr_put+0x140/0x1d0 Modules linked in: CPU: 0 UID: 0 PID: 1682 Comm: ip Not tainted 6.12.0-rc3-custom-g86b5b55bc835 #151 Hardware name: Nvidia SN5600/VMOD0013, BIOS 5.13 05/31/2023 RIP: 0010:mlxsw_sp_ipv6_addr_put+0x140/0x1d0 [...] Call Trace: <TASK> mlxsw_sp_router_netdevice_event+0x55f/0x1240 notifier_call_chain+0x5a/0xd0 call_netdevice_notifiers_info+0x39/0x90 unregister_netdevice_many_notify+0x63e/0x9d0 rtnl_dellink+0x16b/0x3a0 rtnetlink_rcv_msg+0x142/0x3f0 netlink_rcv_skb+0x50/0x100 netlink_unicast+0x242/0x390 netlink_sendmsg+0x1de/0x420 ____sys_sendmsg+0x2bd/0x320 ___sys_sendmsg+0x9a/0xe0 __sys_sendmsg+0x7a/0xd0 do_syscall_64+0x9e/0x1a0 entry_SYSCALL_64_after_hwframe+0x77/0x7f [3] unreferenced object 0xffff898081f597a0 (size 32): comm "ip", pid 1626, jiffies 4294719324 hex dump (first 32 bytes): 20 01 0d b8 00 02 00 00 00 00 00 00 00 00 00 01 ............... 21 49 61 83 80 89 ff ff 00 00 00 00 01 00 00 00 !Ia............. backtrace (crc fd9be911): [<00000000df89c55d>] __kmalloc_cache_noprof+0x1da/0x260 [<00000000ff2a1ddb>] mlxsw_sp_ipv6_addr_kvdl_index_get+0x281/0x340 [<000000009ddd445d>] mlxsw_sp_router_netdevice_event+0x47b/0x1240 [<00000000743e7757>] notifier_call_chain+0x5a/0xd0 [<000000007c7b9e13>] call_netdevice_notifiers_info+0x39/0x90 [<000000002509645d>] register_netdevice+0x5f7/0x7a0 [<00000000c2e7d2a9>] ip6gre_newlink_common.isra.0+0x65/0x130 [<0000000087cd6d8d>] ip6gre_newlink+0x72/0x120 [<000000004df7c7cc>] rtnl_newlink+0x471/0xa20 [<0000000057ed632a>] rtnetlink_rcv_msg+0x142/0x3f0 [<0000000032e0d5b5>] netlink_rcv_skb+0x50/0x100 [<00000000908bca63>] netlink_unicast+0x242/0x390 [<00000000cdbe1c87>] netlink_sendmsg+0x1de/0x420 [<0000000011db153e>] ____sys_sendmsg+0x2bd/0x320 [<000000003b6d53eb>] ___sys_sendmsg+0x9a/0xe0 [<00000000cae27c62>] __sys_sendmsg+0x7a/0xd0 Fixes: cf42911523e0 ("mlxsw: spectrum_ipip: Use common hash table for IPv6 address mapping") Reported-by: Maksym Yaremchuk <maksymy@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/e91012edc5a6cb9df37b78fd377f669381facfcb.1729866134.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30mlxsw: pci: Sync Rx buffers for deviceAmit Cohen
Non-coherent architectures, like ARM, may require invalidating caches before the device can use the DMA mapped memory, which means that before posting pages to device, drivers should sync the memory for device. Sync for device can be configured as page pool responsibility. Set the relevant flag and define max_len for sync. Cc: Jiri Pirko <jiri@resnulli.us> Fixes: b5b60bb491b2 ("mlxsw: pci: Use page pool for Rx buffers allocation") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/92e01f05c4f506a4f0a9b39c10175dcc01994910.1729866134.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30mlxsw: pci: Sync Rx buffers for CPUAmit Cohen
When Rx packet is received, drivers should sync the pages for CPU, to ensure the CPU reads the data written by the device and not stale data from its cache. Add the missing sync call in Rx path, sync the actual length of data for each fragment. Cc: Jiri Pirko <jiri@resnulli.us> Fixes: b5b60bb491b2 ("mlxsw: pci: Use page pool for Rx buffers allocation") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/461486fac91755ca4e04c2068c102250026dcd0b.1729866134.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30mlxsw: spectrum_ptp: Add missing verification before pushing Tx headerAmit Cohen
Tx header should be pushed for each packet which is transmitted via Spectrum ASICs. The cited commit moved the call to skb_cow_head() from mlxsw_sp_port_xmit() to functions which handle Tx header. In case that mlxsw_sp->ptp_ops->txhdr_construct() is used to handle Tx header, and txhdr_construct() is mlxsw_sp_ptp_txhdr_construct(), there is no call for skb_cow_head() before pushing Tx header size to SKB. This flow is relevant for Spectrum-1 and Spectrum-4, for PTP packets. Add the missing call to skb_cow_head() to make sure that there is both enough room to push the Tx header and that the SKB header is not cloned and can be modified. An additional set will be sent to net-next to centralize the handling of the Tx header by pushing it to every packet just before transmission. Cc: Richard Cochran <richardcochran@gmail.com> Fixes: 24157bc69f45 ("mlxsw: Send PTP packets as data packets to overcome a limitation") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/5145780b07ebbb5d3b3570f311254a3a2d554a44.1729866134.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30net: skip offload for NETIF_F_IPV6_CSUM if ipv6 header contains extensionBenoît Monin
As documented in skbuff.h, devices with NETIF_F_IPV6_CSUM capability can only checksum TCP and UDP over IPv6 if the IP header does not contains extension. This is enforced for UDP packets emitted from user-space to an IPv6 address as they go through ip6_make_skb(), which calls __ip6_append_data() where a check is done on the header size before setting CHECKSUM_PARTIAL. But the introduction of UDP encapsulation with fou6 added a code-path where it is possible to get an skb with a partial UDP checksum and an IPv6 header with extension: * fou6 adds a UDP header with a partial checksum if the inner packet does not contains a valid checksum. * ip6_tunnel adds an IPv6 header with a destination option extension header if encap_limit is non-zero (the default value is 4). The thread linked below describes in more details how to reproduce the problem with GRE-in-UDP tunnel. Add a check on the network header size in skb_csum_hwoffload_help() to make sure no IPv6 packet with extension header is handed to a network device with NETIF_F_IPV6_CSUM capability. Link: https://lore.kernel.org/netdev/26548921.1r3eYUQgxm@benoit.monin/T/#u Fixes: aa3463d65e7b ("fou: Add encap ops for IPv6 tunnels") Signed-off-by: Benoît Monin <benoit.monin@gmx.fr> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/5fbeecfc311ea182aa1d1c771725ab8b4cac515e.1729778144.git.benoit.monin@gmx.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-30x86/uaccess: Avoid barrier_nospec() in 64-bit copy_from_user()Linus Torvalds
The barrier_nospec() in 64-bit copy_from_user() is slow. Instead use pointer masking to force the user pointer to all 1's for an invalid address. The kernel test robot reports a 2.6% improvement in the per_thread_ops benchmark [1]. This is a variation on a patch originally by Josh Poimboeuf [2]. Link: https://lore.kernel.org/202410281344.d02c72a2-oliver.sang@intel.com [1] Link: https://lore.kernel.org/5b887fe4c580214900e21f6c61095adf9a142735.1730166635.git.jpoimboe@kernel.org [2] Tested-and-reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-10-30Merge tag 'perf-tools-fixes-for-v6.12-2-2024-10-30' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Update more header copies with the kernel sources, including const.h, msr-index.h, arm64's cputype.h, kvm's, bits.h and unaligned.h - The return from 'write' isn't a pid, fix cut'n'paste error in 'perf trace' - Fix up the python binding build on architectures without HAVE_KVM_STAT_SUPPORT - Add some more bounds checks to augmented_raw_syscalls.bpf.c (used to collect syscall pointer arguments in 'perf trace') to make the resulting bytecode to pass the kernel BPF verifier, allowing us to go back accepting clang 12.0.1 as the minimum version required for compiling BPF sources - Add __NR_capget for x86 to fix a regression on running perf + intel PT (hw tracing) as non-root setting up the capabilities as described in https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html - Fix missing syscalltbl in non-explicitly listed architectures, noticed on ARM 32-bit, that still needs a .tbl generator for the syscall id<->name tables, should be added for v6.13 - Handle 'perf test' failure when handling broken DWARF for ASM files * tag 'perf-tools-fixes-for-v6.12-2-2024-10-30' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: perf cap: Add __NR_capget to arch/x86 unistd tools headers: Update the linux/unaligned.h copy with the kernel sources tools headers arm64: Sync arm64's cputype.h with the kernel sources tools headers: Synchronize {uapi/}linux/bits.h with the kernel sources tools arch x86: Sync the msr-index.h copy with the kernel sources perf python: Fix up the build on architectures without HAVE_KVM_STAT_SUPPORT perf test: Handle perftool-testsuite_probe failure due to broken DWARF tools headers UAPI: Sync kvm headers with the kernel sources perf trace: Fix non-listed archs in the syscalltbl routines perf build: Change the clang check back to 12.0.1 perf trace augmented_raw_syscalls: Add more checks to pass the verifier perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers perf trace: The return from 'write' isn't a pid tools headers UAPI: Sync linux/const.h with the kernel headers
2024-10-30Merge branch 'fixes-for-bits-iterator'Alexei Starovoitov
Hou Tao says: ==================== The patch set fixes several issues in bits iterator. Patch #1 fixes the kmemleak problem of bits iterator. Patch #2~#3 fix the overflow problem of nr_bits. Patch #4 fixes the potential stack corruption when bits iterator is used on 32-bit host. Patch #5 adds more test cases for bits iterator. Please see the individual patches for more details. And comments are always welcome. --- v4: * patch #1: add ack from Yafang * patch #3: revert code-churn like changes: (1) compute nr_bytes and nr_bits before the check of nr_words. (2) use nr_bits == 64 to check for single u64, preventing build warning on 32-bit hosts. * patch #4: use "BITS_PER_LONG == 32" instead of "!defined(CONFIG_64BIT)" v3: https://lore.kernel.org/bpf/20241025013233.804027-1-houtao@huaweicloud.com/T/#t * split the bits-iterator related patches from "Misc fixes for bpf" patch set * patch #1: use "!nr_bits || bits >= nr_bits" to stop the iteration * patch #2: add a new helper for the overflow problem * patch #3: decrease the limitation from 512 to 511 and check whether nr_bytes is too large for bpf memory allocator explicitly * patch #5: add two more test cases for bit iterator v2: http://lore.kernel.org/bpf/d49fa2f4-f743-c763-7579-c3cab4dd88cb@huaweicloud.com ==================== Link: https://lore.kernel.org/r/20241030100516.3633640-1-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30selftests/bpf: Add three test cases for bits_iterHou Tao
Add more test cases for bits iterator: (1) huge word test Verify the multiplication overflow of nr_bits in bits_iter. Without the overflow check, when nr_words is 67108865, nr_bits becomes 64, causing bpf_probe_read_kernel_common() to corrupt the stack. (2) max word test Verify correct handling of maximum nr_words value (511). (3) bad word test Verify early termination of bits iteration when bits iterator initialization fails. Also rename bits_nomem to bits_too_big to better reflect its purpose. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-6-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30bpf: Use __u64 to save the bits in bits iteratorHou Tao
On 32-bit hosts (e.g., arm32), when a bpf program passes a u64 to bpf_iter_bits_new(), bpf_iter_bits_new() will use bits_copy to store the content of the u64. However, bits_copy is only 4 bytes, leading to stack corruption. The straightforward solution would be to replace u64 with unsigned long in bpf_iter_bits_new(). However, this introduces confusion and problems for 32-bit hosts because the size of ulong in bpf program is 8 bytes, but it is treated as 4-bytes after passed to bpf_iter_bits_new(). Fix it by changing the type of both bits and bit_count from unsigned long to u64. However, the change is not enough. The main reason is that bpf_iter_bits_next() uses find_next_bit() to find the next bit and the pointer passed to find_next_bit() is an unsigned long pointer instead of a u64 pointer. For 32-bit little-endian host, it is fine but it is not the case for 32-bit big-endian host. Because under 32-bit big-endian host, the first iterated unsigned long will be the bits 32-63 of the u64 instead of the expected bits 0-31. Therefore, in addition to changing the type, swap the two unsigned longs within the u64 for 32-bit big-endian host. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-5-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30bpf: Check the validity of nr_words in bpf_iter_bits_new()Hou Tao
Check the validity of nr_words in bpf_iter_bits_new(). Without this check, when multiplication overflow occurs for nr_bits (e.g., when nr_words = 0x0400-0001, nr_bits becomes 64), stack corruption may occur due to bpf_probe_read_kernel_common(..., nr_bytes = 0x2000-0008). Fix it by limiting the maximum value of nr_words to 511. The value is derived from the current implementation of BPF memory allocator. To ensure compatibility if the BPF memory allocator's size limitation changes in the future, use the helper bpf_mem_alloc_check_size() to check whether nr_bytes is too larger. And return -E2BIG instead of -ENOMEM for oversized nr_bytes. Fixes: 4665415975b0 ("bpf: Add bits iterator") Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-4-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30bpf: Add bpf_mem_alloc_check_size() helperHou Tao
Introduce bpf_mem_alloc_check_size() to check whether the allocation size exceeds the limitation for the kmalloc-equivalent allocator. The upper limit for percpu allocation is LLIST_NODE_SZ bytes larger than non-percpu allocation, so a percpu argument is added to the helper. The helper will be used in the following patch to check whether the size parameter passed to bpf_mem_alloc() is too big. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30bpf: Free dynamically allocated bits in bpf_iter_bits_destroy()Hou Tao
bpf_iter_bits_destroy() uses "kit->nr_bits <= 64" to check whether the bits are dynamically allocated. However, the check is incorrect and may cause a kmemleak as shown below: unreferenced object 0xffff88812628c8c0 (size 32): comm "swapper/0", pid 1, jiffies 4294727320 hex dump (first 32 bytes): b0 c1 55 f5 81 88 ff ff f0 f0 f0 f0 f0 f0 f0 f0 ..U........... f0 f0 f0 f0 f0 f0 f0 f0 00 00 00 00 00 00 00 00 .............. backtrace (crc 781e32cc): [<00000000c452b4ab>] kmemleak_alloc+0x4b/0x80 [<0000000004e09f80>] __kmalloc_node_noprof+0x480/0x5c0 [<00000000597124d6>] __alloc.isra.0+0x89/0xb0 [<000000004ebfffcd>] alloc_bulk+0x2af/0x720 [<00000000d9c10145>] prefill_mem_cache+0x7f/0xb0 [<00000000ff9738ff>] bpf_mem_alloc_init+0x3e2/0x610 [<000000008b616eac>] bpf_global_ma_init+0x19/0x30 [<00000000fc473efc>] do_one_initcall+0xd3/0x3c0 [<00000000ec81498c>] kernel_init_freeable+0x66a/0x940 [<00000000b119f72f>] kernel_init+0x20/0x160 [<00000000f11ac9a7>] ret_from_fork+0x3c/0x70 [<0000000004671da4>] ret_from_fork_asm+0x1a/0x30 That is because nr_bits will be set as zero in bpf_iter_bits_next() after all bits have been iterated. Fix the issue by setting kit->bit to kit->nr_bits instead of setting kit->nr_bits to zero when the iteration completes in bpf_iter_bits_next(). In addition, use "!nr_bits || bits >= nr_bits" to check whether the iteration is complete and still use "nr_bits > 64" to indicate whether bits are dynamically allocated. The "!nr_bits" check is necessary because bpf_iter_bits_new() may fail before setting kit->nr_bits, and this condition will stop the iteration early instead of accessing the zeroed or freed kit->bits. Considering the initial value of kit->bits is -1 and the type of kit->nr_bits is unsigned int, change the type of kit->nr_bits to int. The potential overflow problem will be handled in the following patch. Fixes: 4665415975b0 ("bpf: Add bits iterator") Acked-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241030100516.3633640-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-30Bluetooth: hci: fix null-ptr-deref in hci_read_supported_codecsSungwoo Kim
Fix __hci_cmd_sync_sk() to return not NULL for unknown opcodes. __hci_cmd_sync_sk() returns NULL if a command returns a status event. However, it also returns NULL where an opcode doesn't exist in the hci_cc table because hci_cmd_complete_evt() assumes status = skb->data[0] for unknown opcodes. This leads to null-ptr-deref in cmd_sync for HCI_OP_READ_LOCAL_CODECS as there is no hci_cc for HCI_OP_READ_LOCAL_CODECS, which always assumes status = skb->data[0]. KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077] CPU: 1 PID: 2000 Comm: kworker/u9:5 Not tainted 6.9.0-ga6bcb805883c-dirty #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Workqueue: hci7 hci_power_on RIP: 0010:hci_read_supported_codecs+0xb9/0x870 net/bluetooth/hci_codec.c:138 Code: 08 48 89 ef e8 b8 c1 8f fd 48 8b 75 00 e9 96 00 00 00 49 89 c6 48 ba 00 00 00 00 00 fc ff df 4c 8d 60 70 4c 89 e3 48 c1 eb 03 <0f> b6 04 13 84 c0 0f 85 82 06 00 00 41 83 3c 24 02 77 0a e8 bf 78 RSP: 0018:ffff888120bafac8 EFLAGS: 00010212 RAX: 0000000000000000 RBX: 000000000000000e RCX: ffff8881173f0040 RDX: dffffc0000000000 RSI: ffffffffa58496c0 RDI: ffff88810b9ad1e4 RBP: ffff88810b9ac000 R08: ffffffffa77882a7 R09: 1ffffffff4ef1054 R10: dffffc0000000000 R11: fffffbfff4ef1055 R12: 0000000000000070 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810b9ac000 FS: 0000000000000000(0000) GS:ffff8881f6c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f6ddaa3439e CR3: 0000000139764003 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <TASK> hci_read_local_codecs_sync net/bluetooth/hci_sync.c:4546 [inline] hci_init_stage_sync net/bluetooth/hci_sync.c:3441 [inline] hci_init4_sync net/bluetooth/hci_sync.c:4706 [inline] hci_init_sync net/bluetooth/hci_sync.c:4742 [inline] hci_dev_init_sync net/bluetooth/hci_sync.c:4912 [inline] hci_dev_open_sync+0x19a9/0x2d30 net/bluetooth/hci_sync.c:4994 hci_dev_do_open net/bluetooth/hci_core.c:483 [inline] hci_power_on+0x11e/0x560 net/bluetooth/hci_core.c:1015 process_one_work kernel/workqueue.c:3267 [inline] process_scheduled_works+0x8ef/0x14f0 kernel/workqueue.c:3348 worker_thread+0x91f/0xe50 kernel/workqueue.c:3429 kthread+0x2cb/0x360 kernel/kthread.c:388 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Fixes: abfeea476c68 ("Bluetooth: hci_sync: Convert MGMT_OP_START_DISCOVERY") Signed-off-by: Sungwoo Kim <iam@sung-woo.kim> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-10-30Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two small fixes, both in drivers (ufs and scsi_debug)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Fix another deadlock during RTC update scsi: scsi_debug: Fix do_device_access() handling of unexpected SG copy length
2024-10-30ALSA: hda/realtek: Fix headset mic on TUXEDO Stellaris 16 Gen6 mb1Christoffer Sandberg
Quirk is needed to enable headset microphone on missing pin 0x19. Signed-off-by: Christoffer Sandberg <cs@tuxedo.de> Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241029151653.80726-2-wse@tuxedocomputers.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-10-30ALSA: hda/realtek: Fix headset mic on TUXEDO Gemini 17 Gen3Christoffer Sandberg
Quirk is needed to enable headset microphone on missing pin 0x19. Signed-off-by: Christoffer Sandberg <cs@tuxedo.de> Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20241029151653.80726-1-wse@tuxedocomputers.com Signed-off-by: Takashi Iwai <tiwai@suse.de>