summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2021-11-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/ipa/ipa_main.c 8afc7e471ad3 ("net: ipa: separate disabling setup from modem stop") 76b5fbcd6b47 ("net: ipa: kill ipa_modem_init()") Duplicated include, drop one. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26net/smc: Don't call clcsock shutdown twice when smc shutdownTony Lu
When applications call shutdown() with SHUT_RDWR in userspace, smc_close_active() calls kernel_sock_shutdown(), and it is called twice in smc_shutdown(). This fixes this by checking sk_state before do clcsock shutdown, and avoids missing the application's call of smc_shutdown(). Link: https://lore.kernel.org/linux-s390/1f67548e-cbf6-0dce-82b5-10288a4583bd@linux.ibm.com/ Fixes: 606a63c9783a ("net/smc: Ensure the active closing peer first closes clcsock") Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Link: https://lore.kernel.org/r/20211126024134.45693-1-tonylu@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26net: vlan: fix underflow for the real_dev refcntZiyang Xuan
Inject error before dev_hold(real_dev) in register_vlan_dev(), and execute the following testcase: ip link add dev dummy1 type dummy ip link add name dummy1.100 link dummy1 type vlan id 100 ip link del dev dummy1 When the dummy netdevice is removed, we will get a WARNING as following: ======================================================================= refcount_t: decrement hit 0; leaking memory. WARNING: CPU: 2 PID: 0 at lib/refcount.c:31 refcount_warn_saturate+0xbf/0x1e0 and an endless loop of: ======================================================================= unregister_netdevice: waiting for dummy1 to become free. Usage count = -1073741824 That is because dev_put(real_dev) in vlan_dev_free() be called without dev_hold(real_dev) in register_vlan_dev(). It makes the refcnt of real_dev underflow. Move the dev_hold(real_dev) to vlan_dev_init() which is the call-back of ndo_init(). That makes dev_hold() and dev_put() for vlan's real_dev symmetrical. Fixes: 563bcbae3ba2 ("net: vlan: fix a UAF in vlan_dev_real_dev()") Reported-by: Petr Machata <petrm@nvidia.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Link: https://lore.kernel.org/r/20211126015942.2918542-1-william.xuanziyang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26ethtool: ioctl: fix potential NULL deref in ethtool_set_coalesce()Julian Wiedmann
ethtool_set_coalesce() now uses both the .get_coalesce() and .set_coalesce() callbacks. But the check for their availability is buggy, so changing the coalesce settings on a device where the driver provides only _one_ of the callbacks results in a NULL pointer dereference instead of an -EOPNOTSUPP. Fix the condition so that the availability of both callbacks is ensured. This also matches the netlink code. Note that reproducing this requires some effort - it only affects the legacy ioctl path, and needs a specific combination of driver options: - have .get_coalesce() and .coalesce_supported but no .set_coalesce(), or - have .set_coalesce() but no .get_coalesce(). Here eg. ethtool doesn't cause the crash as it first attempts to call ethtool_get_coalesce() and bails out on error. Fixes: f3ccfda19319 ("ethtool: extend coalesce setting uAPI with CQE mode") Cc: Yufeng Mo <moyufeng@huawei.com> Cc: Huazhong Tan <tanhuazhong@huawei.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Link: https://lore.kernel.org/r/20211126175543.28000-1-jwi@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26net/sched: sch_ets: don't peek at classes beyond 'nbands'Davide Caratti
when the number of DRR classes decreases, the round-robin active list can contain elements that have already been freed in ets_qdisc_change(). As a consequence, it's possible to see a NULL dereference crash, caused by the attempt to call cl->qdisc->ops->peek(cl->qdisc) when cl->qdisc is NULL: BUG: kernel NULL pointer dereference, address: 0000000000000018 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 1 PID: 910 Comm: mausezahn Not tainted 5.16.0-rc1+ #475 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014 RIP: 0010:ets_qdisc_dequeue+0x129/0x2c0 [sch_ets] Code: c5 01 41 39 ad e4 02 00 00 0f 87 18 ff ff ff 49 8b 85 c0 02 00 00 49 39 c4 0f 84 ba 00 00 00 49 8b ad c0 02 00 00 48 8b 7d 10 <48> 8b 47 18 48 8b 40 38 0f ae e8 ff d0 48 89 c3 48 85 c0 0f 84 9d RSP: 0000:ffffbb36c0b5fdd8 EFLAGS: 00010287 RAX: ffff956678efed30 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000002 RSI: ffffffff9b938dc9 RDI: 0000000000000000 RBP: ffff956678efed30 R08: e2f3207fe360129c R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff956678efeac0 R13: ffff956678efe800 R14: ffff956611545000 R15: ffff95667ac8f100 FS: 00007f2aa9120740(0000) GS:ffff95667b800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000018 CR3: 000000011070c000 CR4: 0000000000350ee0 Call Trace: <TASK> qdisc_peek_dequeued+0x29/0x70 [sch_ets] tbf_dequeue+0x22/0x260 [sch_tbf] __qdisc_run+0x7f/0x630 net_tx_action+0x290/0x4c0 __do_softirq+0xee/0x4f8 irq_exit_rcu+0xf4/0x130 sysvec_apic_timer_interrupt+0x52/0xc0 asm_sysvec_apic_timer_interrupt+0x12/0x20 RIP: 0033:0x7f2aa7fc9ad4 Code: b9 ff ff 48 8b 54 24 18 48 83 c4 08 48 89 ee 48 89 df 5b 5d e9 ed fc ff ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa <53> 48 83 ec 10 48 8b 05 10 64 33 00 48 8b 00 48 85 c0 0f 85 84 00 RSP: 002b:00007ffe5d33fab8 EFLAGS: 00000202 RAX: 0000000000000002 RBX: 0000561f72c31460 RCX: 0000561f72c31720 RDX: 0000000000000002 RSI: 0000561f72c31722 RDI: 0000561f72c31720 RBP: 000000000000002a R08: 00007ffe5d33fa40 R09: 0000000000000014 R10: 0000000000000000 R11: 0000000000000246 R12: 0000561f7187e380 R13: 0000000000000000 R14: 0000000000000000 R15: 0000561f72c31460 </TASK> Modules linked in: sch_ets sch_tbf dummy rfkill iTCO_wdt intel_rapl_msr iTCO_vendor_support intel_rapl_common joydev virtio_balloon lpc_ich i2c_i801 i2c_smbus pcspkr ip_tables xfs libcrc32c crct10dif_pclmul crc32_pclmul crc32c_intel ahci libahci ghash_clmulni_intel serio_raw libata virtio_blk virtio_console virtio_net net_failover failover sunrpc dm_mirror dm_region_hash dm_log dm_mod CR2: 0000000000000018 Ensuring that 'alist' was never zeroed [1] was not sufficient, we need to remove from the active list those elements that are no more SP nor DRR. [1] https://lore.kernel.org/netdev/60d274838bf09777f0371253416e8af71360bc08.1633609148.git.dcaratti@redhat.com/ v3: fix race between ets_qdisc_change() and ets_qdisc_dequeue() delisting DRR classes beyond 'nbands' in ets_qdisc_change() with the qdisc lock acquired, thanks to Cong Wang. v2: when a NULL qdisc is found in the DRR active list, try to dequeue skb from the next list item. Reported-by: Hangbin Liu <liuhangbin@gmail.com> Fixes: dcc68b4d8084 ("net: sch_ets: Add a new Qdisc") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Link: https://lore.kernel.org/r/7a5c496eed2d62241620bdbb83eb03fb9d571c99.1637762721.git.dcaratti@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-26mac80211: notify non-transmitting BSS of color changesJohn Crispin
When color change is triggered in multiple bssid case, allow only for transmitting BSS, and when it changes its bss color, notify the non transmitting BSSs also of the new bss color. Signed-off-by: John Crispin <john@phrozen.org> Co-developed-by: Lavanya Suresh <lavaks@codeaurora.org> Signed-off-by: Lavanya Suresh <lavaks@codeaurora.org> Co-developed-by: Rameshkumar Sundaram <quic_ramess@quicinc.com> Signed-off-by: Rameshkumar Sundaram <quic_ramess@quicinc.com> Link: https://lore.kernel.org/r/1637146647-16282-1-git-send-email-quic_ramess@quicinc.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-11-26mac80211: minstrel_ht: remove unused SAMPLE_SWITCH_THR definePeter Seiderer
Remove unused SAMPLE_SWITCH_THR define. Signed-off-by: Peter Seiderer <ps.report@gmx.net> Link: https://lore.kernel.org/r/20211116221244.30844-1-ps.report@gmx.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-11-26cfg80211: allow continuous radar monitoring on offchannel chainLorenzo Bianconi
Allow continuous radar detection on the offchannel chain in order to switch to the monitored channel whenever the underlying driver reports a radar pattern on the main channel. Tested-by: Owen Peng <owen.peng@mediatek.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/d46217310a49b14ff0e9c002f0a6e0547d70fd2c.1637071350.git.lorenzo@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-11-26cfg80211: schedule offchan_cac_abort_wk in cfg80211_radar_eventLorenzo Bianconi
If necessary schedule offchan_cac_abort_wk work in cfg80211_radar_event routine adding offchan parameter to cfg80211_radar_event signature. Rename cfg80211_radar_event in __cfg80211_radar_event and introduce the two following inline helpers: - cfg80211_radar_event - cfg80211_offchan_radar_event Doing so the drv will not need to run cfg80211_offchan_cac_abort() after radar detection on the offchannel chain. Tested-by: Owen Peng <owen.peng@mediatek.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/3ff583e021e3343a3ced54a7b09b5e184d1880dc.1637062727.git.lorenzo@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-11-26cfg80211: delete redundant free codeliuguoqiang
When kzalloc failed and rdev->sacn_req or rdev->scan_msg is null, pass a null pointer to kfree is redundant, delete it and return directly. Signed-off-by: liuguoqiang <liuguoqiang@uniontech.com> Link: https://lore.kernel.org/r/20211115092139.24407-1-liuguoqiang@uniontech.com [remove now unused creq = NULL assigment] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-11-26mac80211: add support for .ndo_fill_forward_pathFelix Fietkau
This allows drivers to provide a destination device + info for flow offload Only supported in combination with 802.3 encap offload Signed-off-by: Felix Fietkau <nbd@nbd.name> Tested-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/20211112112223.1209-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-11-26mac80211: Remove unused assignment statementsluo penghao
The assignment of these three local variables in the file will not be used in the corresponding functions, so they should be deleted. The clang_analyzer complains as follows: net/mac80211/wpa.c:689:2 warning: net/mac80211/wpa.c:883:2 warning: net/mac80211/wpa.c:452:2 warning: Value stored to 'hdr' is never read Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: luo penghao <luo.penghao@zte.com.cn> Link: https://lore.kernel.org/r/20211104061411.1744-1-luo.penghao@zte.com.cn Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-11-26cfg80211: fix possible NULL pointer dereference in ↵Lorenzo Bianconi
cfg80211_stop_offchan_radar_detection Fix the following NULL pointer dereference in cfg80211_stop_offchan_radar_detection routine that occurs when hostapd is stopped during the CAC on offchannel chain: Sat Jan 1 0[ 779.567851] ESR = 0x96000005 0:12:50 2000 dae[ 779.572346] EC = 0x25: DABT (current EL), IL = 32 bits mon.debug hostap[ 779.578984] SET = 0, FnV = 0 d: hostapd_inter[ 779.583445] EA = 0, S1PTW = 0 face_deinit_free[ 779.587936] Data abort info: : num_bss=1 conf[ 779.592224] ISV = 0, ISS = 0x00000005 ->num_bss=1 Sat[ 779.597403] CM = 0, WnR = 0 Jan 1 00:12:50[ 779.601749] user pgtable: 4k pages, 39-bit VAs, pgdp=00000000418b2000 2000 daemon.deb[ 779.609601] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 ug hostapd: host[ 779.619657] Internal error: Oops: 96000005 [#1] SMP [ 779.770810] CPU: 0 PID: 2202 Comm: hostapd Not tainted 5.10.75 #0 [ 779.776892] Hardware name: MediaTek MT7622 RFB1 board (DT) [ 779.782370] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--) [ 779.788384] pc : cfg80211_chandef_valid+0x10/0x490 [cfg80211] [ 779.794128] lr : cfg80211_check_station_change+0x3190/0x3950 [cfg80211] [ 779.800731] sp : ffffffc01204b7e0 [ 779.804036] x29: ffffffc01204b7e0 x28: ffffff80039bdc00 [ 779.809340] x27: 0000000000000000 x26: ffffffc008cb3050 [ 779.814644] x25: 0000000000000000 x24: 0000000000000002 [ 779.819948] x23: ffffff8002630000 x22: ffffff8003e748d0 [ 779.825252] x21: 0000000000000cc0 x20: ffffff8003da4a00 [ 779.830556] x19: 0000000000000000 x18: ffffff8001bf7ce0 [ 779.835860] x17: 00000000ffffffff x16: 0000000000000000 [ 779.841164] x15: 0000000040d59200 x14: 00000000000019c0 [ 779.846467] x13: 00000000000001c8 x12: 000636b9e9dab1c6 [ 779.851771] x11: 0000000000000141 x10: 0000000000000820 [ 779.857076] x9 : 0000000000000000 x8 : ffffff8003d7d038 [ 779.862380] x7 : 0000000000000000 x6 : ffffff8003d7d038 [ 779.867683] x5 : 0000000000000e90 x4 : 0000000000000038 [ 779.872987] x3 : 0000000000000002 x2 : 0000000000000004 [ 779.878291] x1 : 0000000000000000 x0 : 0000000000000000 [ 779.883594] Call trace: [ 779.886039] cfg80211_chandef_valid+0x10/0x490 [cfg80211] [ 779.891434] cfg80211_check_station_change+0x3190/0x3950 [cfg80211] [ 779.897697] nl80211_radar_notify+0x138/0x19c [cfg80211] [ 779.903005] cfg80211_stop_offchan_radar_detection+0x7c/0x8c [cfg80211] [ 779.909616] __cfg80211_leave+0x2c/0x190 [cfg80211] [ 779.914490] cfg80211_register_netdevice+0x1c0/0x6d0 [cfg80211] [ 779.920404] raw_notifier_call_chain+0x50/0x70 [ 779.924841] call_netdevice_notifiers_info+0x54/0xa0 [ 779.929796] __dev_close_many+0x40/0x100 [ 779.933712] __dev_change_flags+0x98/0x190 [ 779.937800] dev_change_flags+0x20/0x60 [ 779.941628] devinet_ioctl+0x534/0x6d0 [ 779.945370] inet_ioctl+0x1bc/0x230 [ 779.948849] sock_do_ioctl+0x44/0x200 [ 779.952502] sock_ioctl+0x268/0x4c0 [ 779.955985] __arm64_sys_ioctl+0xac/0xd0 [ 779.959900] el0_svc_common.constprop.0+0x60/0x110 [ 779.964682] do_el0_svc+0x1c/0x24 [ 779.967990] el0_svc+0x10/0x1c [ 779.971036] el0_sync_handler+0x9c/0x120 [ 779.974950] el0_sync+0x148/0x180 [ 779.978259] Code: a9bc7bfd 910003fd a90153f3 aa0003f3 (f9400000) [ 779.984344] ---[ end trace 0e67b4f5d6cdeec7 ]--- [ 779.996400] Kernel panic - not syncing: Oops: Fatal exception [ 780.002139] SMP: stopping secondary CPUs [ 780.006057] Kernel Offset: disabled [ 780.009537] CPU features: 0x0000002,04002004 [ 780.013796] Memory Limit: none Fixes: b8f5facf286b ("cfg80211: implement APIs for dedicated radar detection HW") Reported-by: Evelyn Tsai <evelyn.tsai@mediatek.com> Tested-by: Evelyn Tsai <evelyn.tsai@mediatek.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/c2e34c065bf8839c5ffa45498ae154021a72a520.1635958796.git.lorenzo@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-11-26mac80211: fix a memory leak where sta_info is not freedAhmed Zaki
The following is from a system that went OOM due to a memory leak: wlan0: Allocated STA 74:83:c2:64:0b:87 wlan0: Allocated STA 74:83:c2:64:0b:87 wlan0: IBSS finish 74:83:c2:64:0b:87 (---from ieee80211_ibss_add_sta) wlan0: Adding new IBSS station 74:83:c2:64:0b:87 wlan0: moving STA 74:83:c2:64:0b:87 to state 2 wlan0: moving STA 74:83:c2:64:0b:87 to state 3 wlan0: Inserted STA 74:83:c2:64:0b:87 wlan0: IBSS finish 74:83:c2:64:0b:87 (---from ieee80211_ibss_work) wlan0: Adding new IBSS station 74:83:c2:64:0b:87 wlan0: moving STA 74:83:c2:64:0b:87 to state 2 wlan0: moving STA 74:83:c2:64:0b:87 to state 3 . . wlan0: expiring inactive not authorized STA 74:83:c2:64:0b:87 wlan0: moving STA 74:83:c2:64:0b:87 to state 2 wlan0: moving STA 74:83:c2:64:0b:87 to state 1 wlan0: Removed STA 74:83:c2:64:0b:87 wlan0: Destroyed STA 74:83:c2:64:0b:87 The ieee80211_ibss_finish_sta() is called twice on the same STA from 2 different locations. On the second attempt, the allocated STA is not destroyed creating a kernel memory leak. This is happening because sta_info_insert_finish() does not call sta_info_free() the second time when the STA already exists (returns -EEXIST). Note that the caller sta_info_insert_rcu() assumes STA is destroyed upon errors. Same fix is applied to -ENOMEM. Signed-off-by: Ahmed Zaki <anzaki@gmail.com> Link: https://lore.kernel.org/r/20211002145329.3125293-1-anzaki@gmail.com [change the error path label to use the existing code] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-11-26mac80211: set up the fwd_skb->dev for mesh forwardingXing Song
Mesh forwarding requires that the fwd_skb->dev is set up for TX handling, otherwise the following warning will be generated, so set it up for the pending frames. [ 72.835674 ] WARNING: CPU: 0 PID: 1193 at __skb_flow_dissect+0x284/0x1298 [ 72.842379 ] Modules linked in: ksmbd pppoe ppp_async l2tp_ppp ... [ 72.962020 ] CPU: 0 PID: 1193 Comm: kworker/u5:1 Tainted: P S 5.4.137 #0 [ 72.969938 ] Hardware name: MT7622_MT7531 RFB (DT) [ 72.974659 ] Workqueue: napi_workq napi_workfn [ 72.979025 ] pstate: 60000005 (nZCv daif -PAN -UAO) [ 72.983822 ] pc : __skb_flow_dissect+0x284/0x1298 [ 72.988444 ] lr : __skb_flow_dissect+0x54/0x1298 [ 72.992977 ] sp : ffffffc010c738c0 [ 72.996293 ] x29: ffffffc010c738c0 x28: 0000000000000000 [ 73.001615 ] x27: 000000000000ffc2 x26: ffffff800c2eb818 [ 73.006937 ] x25: ffffffc010a987c8 x24: 00000000000000ce [ 73.012259 ] x23: ffffffc010c73a28 x22: ffffffc010a99c60 [ 73.017581 ] x21: 000000000000ffc2 x20: ffffff80094da800 [ 73.022903 ] x19: 0000000000000000 x18: 0000000000000014 [ 73.028226 ] x17: 00000000084d16af x16: 00000000d1fc0bab [ 73.033548 ] x15: 00000000715f6034 x14: 000000009dbdd301 [ 73.038870 ] x13: 00000000ea4dcbc3 x12: 0000000000000040 [ 73.044192 ] x11: 000000000eb00ff0 x10: 0000000000000000 [ 73.049513 ] x9 : 000000000eb00073 x8 : 0000000000000088 [ 73.054834 ] x7 : 0000000000000000 x6 : 0000000000000001 [ 73.060155 ] x5 : 0000000000000000 x4 : 0000000000000000 [ 73.065476 ] x3 : ffffffc010a98000 x2 : 0000000000000000 [ 73.070797 ] x1 : 0000000000000000 x0 : 0000000000000000 [ 73.076120 ] Call trace: [ 73.078572 ] __skb_flow_dissect+0x284/0x1298 [ 73.082846 ] __skb_get_hash+0x7c/0x228 [ 73.086629 ] ieee80211_txq_may_transmit+0x7fc/0x17b8 [mac80211] [ 73.092564 ] ieee80211_tx_prepare_skb+0x20c/0x268 [mac80211] [ 73.098238 ] ieee80211_tx_pending+0x144/0x330 [mac80211] [ 73.103560 ] tasklet_action_common.isra.16+0xb4/0x158 [ 73.108618 ] tasklet_action+0x2c/0x38 [ 73.112286 ] __do_softirq+0x168/0x3b0 [ 73.115954 ] do_softirq.part.15+0x88/0x98 [ 73.119969 ] __local_bh_enable_ip+0xb0/0xb8 [ 73.124156 ] napi_workfn+0x58/0x90 [ 73.127565 ] process_one_work+0x20c/0x478 [ 73.131579 ] worker_thread+0x50/0x4f0 [ 73.135249 ] kthread+0x124/0x128 [ 73.138484 ] ret_from_fork+0x10/0x1c Signed-off-by: Xing Song <xing.song@mediatek.com> Tested-By: Frank Wunderlich <frank-w@public-files.de> Link: https://lore.kernel.org/r/20211123033123.2684-1-xing.song@mediatek.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-11-26mac80211: fix regression in SSN handling of addba txFelix Fietkau
Some drivers that do their own sequence number allocation (e.g. ath9k) rely on being able to modify params->ssn on starting tx ampdu sessions. This was broken by a change that modified it to use sta->tid_seq[tid] instead. Cc: stable@vger.kernel.org Fixes: 31d8bb4e07f8 ("mac80211: agg-tx: refactor sending addba") Reported-by: Eneas U de Queiroz <cotequeiroz@gmail.com> Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20211124094024.43222-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-11-26mac80211: fix rate control for retransmitted framesFelix Fietkau
Since retransmission clears info->control, rate control needs to be called again, otherwise the driver might crash due to invalid rates. Cc: stable@vger.kernel.org # 5.14+ Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi> Reported-by: Robert W <rwbugreport@lost-in-the-void.net> Fixes: 03c3911d2d67 ("mac80211: call ieee80211_tx_h_rate_ctrl() when dequeue") Signed-off-by: Felix Fietkau <nbd@nbd.name> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Link: https://lore.kernel.org/r/20211122204323.9787-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-11-26mac80211: track only QoS data frames for admission controlJohannes Berg
For admission control, obviously all of that only works for QoS data frames, otherwise we cannot even access the QoS field in the header. Syzbot reported (see below) an uninitialized value here due to a status of a non-QoS nullfunc packet, which isn't even long enough to contain the QoS header. Fix this to only do anything for QoS data packets. Reported-by: syzbot+614e82b88a1a4973e534@syzkaller.appspotmail.com Fixes: 02219b3abca5 ("mac80211: add WMM admission control support") Link: https://lore.kernel.org/r/20211122124737.dad29e65902a.Ieb04587afacb27c14e0de93ec1bfbefb238cc2a0@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-11-26mac80211: fix TCP performance on mesh interfaceMaxime Bizon
sta is NULL for mesh point (resolved later), so sk pacing parameters were not applied. Signed-off-by: Maxime Bizon <mbizon@freebox.fr> Link: https://lore.kernel.org/r/66f51659416ac35d6b11a313bd3ffe8b8a43dd55.camel@freebox.fr Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-11-25sctp: make the raise timer more simple and accurateXin Long
Currently, the probe timer is reused as the raise timer when PLPMTUD is in the Search Complete state. raise_count was introduced to count how many times the probe timer has timed out. When raise_count reaches to 30, the raise timer handler will be triggered. During the whole processing above, the timer keeps timing out every probe_ interval. It is a waste for the Search Complete state, as the raise timer only needs to time out after 30 * probe_interval. Since the raise timer and probe timer are never used at the same time, it is no need to keep probe timer 'alive' in the Search Complete state. This patch to introduce sctp_transport_reset_raise_timer() to start the timer as the raise timer when entering the Search Complete state. When entering the other states, sctp_transport_reset_probe_timer() will still be called to reset the timer to the probe timer. raise_count can be removed from sctp_transport as no need to count probe timer timeout for raise timer timeout. last_rtx_chunks can be removed as sctp_transport_reset_probe_timer() can be called in the place where asoc rtx_data_chunks is changed. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Link: https://lore.kernel.org/r/edb0e48988ea85997488478b705b11ddc1ba724a.1637781974.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25tipc: delete the unlikely branch in tipc_aead_encryptXin Long
When a skb comes to tipc_aead_encrypt(), it's always linear. The unlikely check 'skb_cloned(skb) && tailen <= skb_tailroom(skb)' can completely be taken care of in skb_cow_data() by the code in branch "if (!skb_has_frag_list())". Also, remove the 'TODO:' annotation, as the pages in skbs are not writable, see more on commit 3cf4375a0904 ("tipc: do not write skb_shinfo frags when doing decrytion"). Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Link: https://lore.kernel.org/r/47a478da0b6095b76e3cbe7a75cbd25d9da1df9a.1637773872.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25tls: fix replacing proto_opsJakub Kicinski
We replace proto_ops whenever TLS is configured for RX. But our replacement also overrides sendpage_locked, which will crash unless TX is also configured. Similarly we plug both of those in for TLS_HW (NIC crypto offload) even tho TLS_HW has a completely different implementation for TX. Last but not least we always plug in something based on inet_stream_ops even though a few of the callbacks differ for IPv6 (getname, release, bind). Use a callback building method similar to what we do for struct proto. Fixes: c46234ebb4d1 ("tls: RX path for ktls") Fixes: d4ffb02dee2f ("net/tls: enable sk_msg redirect to tls socket egress") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25tls: splice_read: fix accessing pre-processed recordsJakub Kicinski
recvmsg() will put peek()ed and partially read records onto the rx_list. splice_read() needs to consult that list otherwise it may miss data. Align with recvmsg() and also put partially-read records onto rx_list. tls_sw_advance_skb() is pretty pointless now and will be removed in net-next. Fixes: 692d7b5d1f91 ("tls: Fix recvmsg() to be able to peek across multiple records") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25tls: splice_read: fix record type checkJakub Kicinski
We don't support splicing control records. TLS 1.3 changes moved the record type check into the decrypt if(). The skb may already be decrypted and still be an alert. Note that decrypt_skb_update() is idempotent and updates ctx->decrypted so the if() is pointless. Reorder the check for decryption errors with the content type check while touching them. This part is not really a bug, because if decryption failed in TLS 1.3 content type will be DATA, and for TLS 1.2 it will be correct. Nevertheless its strange to touch output before checking if the function has failed. Fixes: fedf201e1296 ("net: tls: Refactor control message handling on recv") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-25Bluetooth: Limit duration of Remote Name ResolveArchie Pusaka
When doing remote name request, we cannot scan. In the normal case it's OK since we can expect it to finish within a short amount of time. However, there is a possibility to scan lots of devices that (1) requires Remote Name Resolve (2) is unresponsive to Remote Name Resolve When this happens, we are stuck to do Remote Name Resolve until all is done before continue scanning. This patch adds a time limit to stop us spending too long on remote name request. Signed-off-by: Archie Pusaka <apusaka@chromium.org> Reviewed-by: Miao-chen Chou <mcchou@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-11-25Bluetooth: Send device found event on name resolve failureArchie Pusaka
Introducing NAME_REQUEST_FAILED flag that will be sent together with device found event on name resolve failure. This will provide the userspace with an information so it can decide not to resolve the name for these devices in the future. Signed-off-by: Archie Pusaka <apusaka@chromium.org> Reviewed-by: Miao-chen Chou <mcchou@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-11-25Bluetooth: HCI: Fix definition of hci_rp_delete_stored_link_keyLuiz Augusto von Dentz
num_keys is actually 2 octects not 1: BLUETOOTH CORE SPECIFICATION Version 5.3 | Vol 4, Part E page 1989: Num_Keys_Deleted: Size: 2 octets 0xXXXX Number of Link Keys Deleted Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-11-25Bluetooth: HCI: Fix definition of hci_rp_read_stored_link_keyLuiz Augusto von Dentz
Both max_num_keys and num_key are 2 octects: BLUETOOTH CORE SPECIFICATION Version 5.3 | Vol 4, Part E page 1985: Max_Num_Keys: Size: 2 octets Range: 0x0000 to 0xFFFF Num_Keys_Read: Size: 2 octets Range: 0x0000 to 0xFFFF Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-11-25tty: remove file from tty_ldisc_ops::ioctl and compat_ioctlJiri Slaby
After the previous patches, noone needs 'file' parameter in neither ioctl hook from tty_ldisc_ops. So remove 'file' from both of them. Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Cc: Luiz Augusto von Dentz <luiz.dentz@gmail.com> Cc: Wolfgang Grandegger <wg@grandegger.com> Cc: Marc Kleine-Budde <mkl@pengutronix.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Andreas Koensgen <ajk@comnets.uni-bremen.de> Cc: Paul Mackerras <paulus@samba.org> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> [NFC] Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Link: https://lore.kernel.org/r/20211122094529.24171-1-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-24net/smc: Fix loop in smc_listenGuo DaXing
The kernel_listen function in smc_listen will fail when all the available ports are occupied. At this point smc->clcsock->sk->sk_data_ready has been changed to smc_clcsock_data_ready. When we call smc_listen again, now both smc->clcsock->sk->sk_data_ready and smc->clcsk_data_ready point to the smc_clcsock_data_ready function. The smc_clcsock_data_ready() function calls lsmc->clcsk_data_ready which now points to itself resulting in an infinite loop. This patch restores smc->clcsock->sk->sk_data_ready with the old value. Fixes: a60a2b1e0af1 ("net/smc: reduce active tcp_listen workers") Signed-off-by: Guo DaXing <guodaxing@huawei.com> Acked-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-24net/smc: Fix NULL pointer dereferencing in smc_vlan_by_tcpsk()Karsten Graul
Coverity reports a possible NULL dereferencing problem: in smc_vlan_by_tcpsk(): 6. returned_null: netdev_lower_get_next returns NULL (checked 29 out of 30 times). 7. var_assigned: Assigning: ndev = NULL return value from netdev_lower_get_next. 1623 ndev = (struct net_device *)netdev_lower_get_next(ndev, &lower); CID 1468509 (#1 of 1): Dereference null return value (NULL_RETURNS) 8. dereference: Dereferencing a pointer that might be NULL ndev when calling is_vlan_dev. 1624 if (is_vlan_dev(ndev)) { Remove the manual implementation and use netdev_walk_all_lower_dev() to iterate over the lower devices. While on it remove an obsolete function parameter comment. Fixes: cb9d43f67754 ("net/smc: determine vlan_id of stacked net_device") Suggested-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-24net-ipv6: changes to ->tclass (via IPV6_TCLASS) should sk_dst_reset()Maciej Żenczykowski
This is to match ipv4 behaviour, see __ip_sock_set_tos() implementation. Technically for ipv6 this might not be required because normally we do not allow tclass to influence routing, yet the cli tooling does support it: lpk11:~# ip -6 rule add pref 5 tos 45 lookup 5 lpk11:~# ip -6 rule 5: from all tos 0x45 lookup 5 and in general dscp/tclass based routing does make sense. We already have cases where dscp can affect vlan priority and/or transmit queue (especially on wifi). So let's just make things match. Easier to reason about and no harm. Cc: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Link: https://lore.kernel.org/r/20211123223208.1117871-1-zenczykowski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-24net-ipv6: do not allow IPV6_TCLASS to muck with tcp's ECNMaciej Żenczykowski
This is to match ipv4 behaviour, see __ip_sock_set_tos() implementation at ipv4/ip_sockglue.c:579 void __ip_sock_set_tos(struct sock *sk, int val) { if (sk->sk_type == SOCK_STREAM) { val &= ~INET_ECN_MASK; val |= inet_sk(sk)->tos & INET_ECN_MASK; } if (inet_sk(sk)->tos != val) { inet_sk(sk)->tos = val; sk->sk_priority = rt_tos2priority(val); sk_dst_reset(sk); } } Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20211123223154.1117794-1-zenczykowski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-24net: allow SO_MARK with CAP_NET_RAWMaciej Żenczykowski
A CAP_NET_RAW capable process can already spoof (on transmit) anything it desires via raw packet sockets... There is no good reason to not allow it to also be able to play routing tricks on packets from its own normal sockets. There is a desire to be able to use SO_MARK for routing table selection (via ip rule fwmark) from within a user process without having to run it as root. Granting it CAP_NET_RAW is much less dangerous than CAP_NET_ADMIN (CAP_NET_RAW doesn't permit persistent state change, while CAP_NET_ADMIN does - by for example allowing the reconfiguration of the routing tables and/or bringing up/down devices). Let's keep CAP_NET_ADMIN for persistent state changes, while using CAP_NET_RAW for non-configuration related stuff. Signed-off-by: Maciej Żenczykowski <maze@google.com> Link: https://lore.kernel.org/r/20211123203715.193413-1-zenczykowski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-24net: allow CAP_NET_RAW to setsockopt SO_PRIORITYMaciej Żenczykowski
CAP_NET_ADMIN is and should continue to be about configuring the system as a whole, not about configuring per-socket or per-packet parameters. Sending and receiving raw packets is what CAP_NET_RAW is all about. It can already send packets with any VLAN tag, and any IPv4 TOS mark, and any IPv6 TCLASS mark, simply by virtue of building such a raw packet. Not to mention using any protocol and source/ /destination ip address/port tuple. These are the fields that networking gear uses to prioritize packets. Hence, a CAP_NET_RAW process is already capable of affecting traffic prioritization after it hits the wire. This change makes it capable of affecting traffic prioritization even in the host at the nic and before that in the queueing disciplines (provided skb->priority is actually being used for prioritization, and not the TOS/TCLASS field) Hence it makes sense to allow a CAP_NET_RAW process to set the priority of sockets and thus packets it sends. Signed-off-by: Maciej Żenczykowski <maze@google.com> Link: https://lore.kernel.org/r/20211123203702.193221-1-zenczykowski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-24tcp_cubic: fix spurious Hystart ACK train detections for not-cwnd-limited flowsEric Dumazet
While testing BIG TCP patch series, I was expecting that TCP_RR workloads with 80KB requests/answers would send one 80KB TSO packet, then being received as a single GRO packet. It turns out this was not happening, and the root cause was that cubic Hystart ACK train was triggering after a few (2 or 3) rounds of RPC. Hystart was wrongly setting CWND/SSTHRESH to 30, while my RPC needed a budget of ~20 segments. Ideally these TCP_RR flows should not exit slow start. Cubic Hystart should reset itself at each round, instead of assuming every TCP flow is a bulk one. Note that even after this patch, Hystart can still trigger, depending on scheduling artifacts, but at a higher CWND/SSTHRESH threshold, keeping optimal TSO packet sizes. Tested: ip link set dev eth0 gro_ipv6_max_size 131072 gso_ipv6_max_size 131072 nstat -n; netperf -H ... -t TCP_RR -l 5 -- -r 80000,80000 -K cubic; nstat|egrep "Ip6InReceives|Hystart|Ip6OutRequests" Before: 8605 Ip6InReceives 87541 0.0 Ip6OutRequests 129496 0.0 TcpExtTCPHystartTrainDetect 1 0.0 TcpExtTCPHystartTrainCwnd 30 0.0 After: 8760 Ip6InReceives 88514 0.0 Ip6OutRequests 87975 0.0 Fixes: ae27e98a5152 ("[TCP] CUBIC v2.3") Co-developed-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Yuchung Cheng <ycheng@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Link: https://lore.kernel.org/r/20211123202535.1843771-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-24net: bridge: Allow base 16 inputs in sysfsIdo Schimmel
Cited commit converted simple_strtoul() to kstrtoul() as suggested by the former's documentation. However, it also forced all the inputs to be decimal resulting in user space breakage. Fix by setting the base to '0' so that the base is automatically detected. Before: # ip link add name br0 type bridge vlan_filtering 1 # echo "0x88a8" > /sys/class/net/br0/bridge/vlan_protocol bash: echo: write error: Invalid argument After: # ip link add name br0 type bridge vlan_filtering 1 # echo "0x88a8" > /sys/class/net/br0/bridge/vlan_protocol # echo $? 0 Fixes: 520fbdf7fb19 ("net/bridge: replace simple_strtoul to kstrtol") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Link: https://lore.kernel.org/r/20211124101122.3321496-1-idosch@idosch.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-24gro: remove rcu_read_lock/rcu_read_unlock from gro_complete handlersEric Dumazet
All gro_complete() handlers are called from napi_gro_complete() while rcu_read_lock() has been called. There is no point stacking more rcu_read_lock() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-24gro: remove rcu_read_lock/rcu_read_unlock from gro_receive handlersEric Dumazet
All gro_receive() handlers are called from dev_gro_receive() while rcu_read_lock() has been called. There is no point stacking more rcu_read_lock() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-24Bluetooth: refactor malicious adv data checkBrian Gix
Check for out-of-bound read was being performed at the end of while num_reports loop, and would fill journal with false positives. Added check to beginning of loop processing so that it doesn't get checked after ptr has been advanced. Signed-off-by: Brian Gix <brian.gix@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-11-24net/ncsi : Add payload to be 32-bit aligned to fix dropped packetsKumar Thangavel
Update NC-SI command handler (both standard and OEM) to take into account of payload paddings in allocating skb (in case of payload size is not 32-bit aligned). The checksum field follows payload field, without taking payload padding into account can cause checksum being truncated, leading to dropped packets. Fixes: fb4ee67529ff ("net/ncsi: Add NCSI OEM command support") Signed-off-by: Kumar Thangavel <thangavel.k@hcl.com> Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23dccp: Inline dccp_listen_start().Kuniyuki Iwashima
This patch inlines dccp_listen_start() and removes a stale comment in inet_dccp_listen() so that it looks like inet_listen(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Reviewed-by: Richard Sailer <richard_siegfried@systemli.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-23dccp/tcp: Remove an unused argument in inet_csk_listen_start().Kuniyuki Iwashima
The commit 1295e2cf3065 ("inet: minor optimization for backlog setting in listen(2)") added change so that sk_max_ack_backlog is initialised earlier in inet_dccp_listen() and inet_listen(). Since then, we no longer use backlog in inet_csk_listen_start(), so let's remove it. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Acked-by: Yafang Shao <laoar.shao@gmail.com> Reviewed-by: Richard Sailer <richard_siegfried@systemli.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-23net: remove .ndo_change_proto_downJakub Kicinski
.ndo_change_proto_down was added seemingly to enable out-of-tree implementations. Over 2.5yrs later we still have no real users upstream. Hardwire the generic implementation for now, we can revert once real users materialize. (rocker is a test vehicle, not a user.) We need to drop the optimization on the sysfs side, because unlike ndos priv_flags will be changed at runtime, so we'd need READ_ONCE/WRITE_ONCE everywhere.. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23net/smc: Ensure the active closing peer first closes clcsockTony Lu
The side that actively closed socket, it's clcsock doesn't enter TIME_WAIT state, but the passive side does it. It should show the same behavior as TCP sockets. Consider this, when client actively closes the socket, the clcsock in server enters TIME_WAIT state, which means the address is occupied and won't be reused before TIME_WAIT dismissing. If we restarted server, the service would be unavailable for a long time. To solve this issue, shutdown the clcsock in [A], perform the TCP active close progress first, before the passive closed side closing it. So that the actively closed side enters TIME_WAIT, not the passive one. Client | Server close() // client actively close | smc_release() | smc_close_active() // PEERCLOSEWAIT1 | smc_close_final() // abort or closed = 1| smc_cdc_get_slot_and_msg_send() | [A] | |smc_cdc_msg_recv_action() // ACTIVE | queue_work(smc_close_wq, &conn->close_work) | smc_close_passive_work() // PROCESSABORT or APPCLOSEWAIT1 | smc_close_passive_abort_received() // only in abort | |close() // server recv zero, close | smc_release() // PROCESSABORT or APPCLOSEWAIT1 | smc_close_active() | smc_close_abort() or smc_close_final() // CLOSED | smc_cdc_get_slot_and_msg_send() // abort or closed = 1 smc_cdc_msg_recv_action() | smc_clcsock_release() queue_work(smc_close_wq, &conn->close_work) | sock_release(tcp) // actively close clc, enter TIME_WAIT smc_close_passive_work() // PEERCLOSEWAIT1 | smc_conn_free() smc_close_passive_abort_received() // CLOSED| smc_conn_free() | smc_clcsock_release() | sock_release(tcp) // passive close clc | Link: https://www.spinics.net/lists/netdev/msg780407.html Fixes: b38d732477e4 ("smc: socket closing and linkgroup cleanup") Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23net/smc: Clean up local struct sock variablesTony Lu
There remains some variables to replace with local struct sock. So clean them up all. Fixes: 3163c5071f25 ("net/smc: use local struct sock variables consistently") Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23net: nexthop: fix null pointer dereference when IPv6 is not enabledNikolay Aleksandrov
When we try to add an IPv6 nexthop and IPv6 is not enabled (!CONFIG_IPV6) we'll hit a NULL pointer dereference[1] in the error path of nh_create_ipv6() due to calling ipv6_stub->fib6_nh_release. The bug has been present since the beginning of IPv6 nexthop gateway support. Commit 1aefd3de7bc6 ("ipv6: Add fib6_nh_init and release to stubs") tells us that only fib6_nh_init has a dummy stub because fib6_nh_release should not be called if fib6_nh_init returns an error, but the commit below added a call to ipv6_stub->fib6_nh_release in its error path. To fix it return the dummy stub's -EAFNOSUPPORT error directly without calling ipv6_stub->fib6_nh_release in nh_create_ipv6()'s error path. [1] Output is a bit truncated, but it clearly shows the error. BUG: kernel NULL pointer dereference, address: 000000000000000000 #PF: supervisor instruction fetch in kernel modede #PF: error_code(0x0010) - not-present pagege PGD 0 P4D 0 Oops: 0010 [#1] PREEMPT SMP NOPTI CPU: 4 PID: 638 Comm: ip Kdump: loaded Not tainted 5.16.0-rc1+ #446 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014 RIP: 0010:0x0 Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. RSP: 0018:ffff888109f5b8f0 EFLAGS: 00010286^Ac RAX: 0000000000000000 RBX: ffff888109f5ba28 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8881008a2860 RBP: ffff888109f5b9d8 R08: 0000000000000000 R09: 0000000000000000 R10: ffff888109f5b978 R11: ffff888109f5b948 R12: 00000000ffffff9f R13: ffff8881008a2a80 R14: ffff8881008a2860 R15: ffff8881008a2840 FS: 00007f98de70f100(0000) GS:ffff88822bf00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 0000000100efc000 CR4: 00000000000006e0 Call Trace: <TASK> nh_create_ipv6+0xed/0x10c rtm_new_nexthop+0x6d7/0x13f3 ? check_preemption_disabled+0x3d/0xf2 ? lock_is_held_type+0xbe/0xfd rtnetlink_rcv_msg+0x23f/0x26a ? check_preemption_disabled+0x3d/0xf2 ? rtnl_calcit.isra.0+0x147/0x147 netlink_rcv_skb+0x61/0xb2 netlink_unicast+0x100/0x187 netlink_sendmsg+0x37f/0x3a0 ? netlink_unicast+0x187/0x187 sock_sendmsg_nosec+0x67/0x9b ____sys_sendmsg+0x19d/0x1f9 ? copy_msghdr_from_user+0x4c/0x5e ? rcu_read_lock_any_held+0x2a/0x78 ___sys_sendmsg+0x6c/0x8c ? asm_sysvec_apic_timer_interrupt+0x12/0x20 ? lockdep_hardirqs_on+0xd9/0x102 ? sockfd_lookup_light+0x69/0x99 __sys_sendmsg+0x50/0x6e do_syscall_64+0xcb/0xf2 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f98dea28914 Code: 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 80 00 00 00 00 48 8d 05 e9 5d 0c 00 8b 00 85 c0 75 13 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 41 54 41 89 d4 55 48 89 f5 53 RSP: 002b:00007fff859f5e68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e2e RAX: ffffffffffffffda RBX: 00000000619cb810 RCX: 00007f98dea28914 RDX: 0000000000000000 RSI: 00007fff859f5ed0 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000008 R10: fffffffffffffce6 R11: 0000000000000246 R12: 0000000000000001 R13: 000055c0097ae520 R14: 000055c0097957fd R15: 00007fff859f63a0 </TASK> Modules linked in: bridge stp llc bonding virtio_net Cc: stable@vger.kernel.org Fixes: 53010f991a9f ("nexthop: Add support for IPv6 gateways") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-23xfrm: fix policy lookup for ipv6 gre packetsGhalem Boudour
On egress side, xfrm lookup is called from __gre6_xmit() with the fl6_gre_key field not initialized leading to policies selectors check failure. Consequently, gre packets are sent without encryption. On ingress side, INET6_PROTO_NOPOLICY was set, thus packets were not checked against xfrm policies. Like for egress side, fl6_gre_key should be correctly set, this is now done in decode_session6(). Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Cc: stable@vger.kernel.org Signed-off-by: Ghalem Boudour <ghalem.boudour@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2021-11-22lsm: security_task_getsecid_subj() -> security_current_getsecid_subj()Paul Moore
The security_task_getsecid_subj() LSM hook invites misuse by allowing callers to specify a task even though the hook is only safe when the current task is referenced. Fix this by removing the task_struct argument to the hook, requiring LSM implementations to use the current task. While we are changing the hook declaration we also rename the function to security_current_getsecid_subj() in an effort to reinforce that the hook captures the subjective credentials of the current task and not an arbitrary task on the system. Reviewed-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-11-22SUNRPC: use different lock keys for INET6 and LOCALNeilBrown
xprtsock.c reclassifies sock locks based on the protocol. However there are 3 protocols and only 2 classification keys. The same key is used for both INET6 and LOCAL. This causes lockdep complaints. The complaints started since Commit ea9afca88bbe ("SUNRPC: Replace use of socket sk_callback_lock with sock_lock") which resulted in the sock locks beings used more. So add another key, and renumber them slightly. Fixes: ea9afca88bbe ("SUNRPC: Replace use of socket sk_callback_lock with sock_lock") Fixes: 176e21ee2ec8 ("SUNRPC: Support for RPC over AF_LOCAL transports") Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>