summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-11-20brcmfmac: remove monitor interface when detachingRafał Miłecki
This fixes a minor WARNING in the cfg80211: [ 130.658034] ------------[ cut here ]------------ [ 130.662805] WARNING: CPU: 1 PID: 610 at net/wireless/core.c:954 wiphy_unregister+0xb4/0x198 [cfg80211] Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20brcmfmac: disable PCIe interrupts before bus resetRafał Miłecki
Keeping interrupts on could result in brcmfmac freeing some resources and then IRQ handlers trying to use them. That was obviously a straight path for crashing a kernel. Example: CPU0 CPU1 ---- ---- brcmf_pcie_reset brcmf_pcie_bus_console_read brcmf_detach ... brcmf_fweh_detach brcmf_proto_detach brcmf_pcie_isr_thread ... brcmf_proto_msgbuf_rx_trigger ... drvr->proto->pd brcmf_pcie_release_irq [ 363.789218] Unable to handle kernel NULL pointer dereference at virtual address 00000038 [ 363.797339] pgd = c0004000 [ 363.800050] [00000038] *pgd=00000000 [ 363.803635] Internal error: Oops: 17 [#1] SMP ARM (...) [ 364.029209] Backtrace: [ 364.031725] [<bf243838>] (brcmf_proto_msgbuf_rx_trigger [brcmfmac]) from [<bf2471dc>] (brcmf_pcie_isr_thread+0x228/0x274 [brcmfmac]) [ 364.043662] r7:00000001 r6:c8ca0000 r5:00010000 r4:c7b4f800 Fixes: 4684997d9eea ("brcmfmac: reset PCIe bus on a firmware crash") Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20rtw88: allows to enable/disable HCI link PS mechanismYan-Hsuan Chuang
Different interfaces have its own link-related power save mechanism. Such as PCI can enter L1 state based on the traffic on the link, and sometimes driver needs to enable/disable it to avoid some issues, like throughput degrade when PCI trying to enter L1 state even if driver is having heavy traffic. For now, rtw88 only supports PCIE chips, and they just need to disable ASPM L1 when driver is not in power save mode, such as IPS and LPS. Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20rtw88: pci: enable CLKREQ function if host supports itYan-Hsuan Chuang
By Realtek's design, there are two HW modules associated for CLKREQ, one is responsible to follow the PCIE host settings, and another is to actually working on it. But the module that is actually working on it is default disabled, and driver should enable that module if host and device have successfully sync'ed with each other. The module is default disabled because sometimes the host does not support it, and if there is any incorrect settings (ex. CLKREQ# is not Bi-Direction), device can be lost and disconnected to the host. So driver should first check after host and device are sync'ed, and the host does support the function and set it in configuration space, then driver can turn on the HW module to working on it. Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com> Reviewed-by: Chris Chiu <chiu@endlessm.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20rtw88: pci: use for loop instead of while loop for DBI/MDIOYan-Hsuan Chuang
Use a for loop to polling DBI/MDIO read/write flags to avoid infinite loop happens Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com> Reviewed-by: Chris Chiu <chiu@endlessm.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20rtw88: pci: use macros to access PCI DBI/MDIO registersYan-Hsuan Chuang
Add some register and bit macros to access DBI/MDIO register. This should not change the logic. Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20qtnfmac: process HE capabilities requestsMikhail Karpenko
Pass HE interface type data requests between firmware and driver. Signed-off-by: Mikhail Karpenko <mkarpenko@quantenna.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20qtnfmac: add TLV for extension IEsMikhail Karpenko
Extension information elements have additional field for ID. This commit adds TLV for such elements and a structure for interface HE capabilities communication with firmware. Signed-off-by: Mikhail Karpenko <mkarpenko@quantenna.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20qtnfmac: signal that all packets coming from device are already floodedIgor Mitsyanko
Firmware floods all packets that need to be flooded (multicast, broadcast, unknown unicast) as required. Tell kernel bridge subsystem it does not need to flood packet itself by marking each incoming frame with skb->offload_fwd_mark flag. Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20qtnfmac: advertise netdev port parent IDIgor Mitsyanko
Use MAC address of the first active radio as a unique device ID. Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20qtnfmac: add interface ID to each packetIgor Mitsyanko
Add interface ID information to the tail of each transmitted packet so that firmware can know to which interface the packet belongs to. This is only needed if device supports HW switch capability. Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20qtnfmac: track broadcast domain of each interfaceIgor Mitsyanko
If firmware reports that it supports hardware switch capabilities, driver needs to track and notify device whenever broadcast domain of a particular network device changes (ie. whenever it's upper master device changes). Firmware needs a unique ID to tell broadcast domains from each other which is an opaque number otherwise. For that purpose we can use netspace:ifidx pair to uniquely identify each broadcast domain: - if netdev is not part of a bridge, then use it's own ifidx as a broadcast domain ID - if netdev is part of a bridge, then use bridge netdev ifidx as broadcast domain ID Firmware makes sure that packets are only forwarded between interfaces marked with the same broadcast domain ID. Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20qtnfmac: remove VIF in firmware in case of errorIgor Mitsyanko
Currently in case of error when registering network device with the kernel, we won't properly cleanup VIF state in firmware due to DEL_VIF command will not be send to wifi card. Make sure it does. Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20rtlwifi: set proper udelay within rf_serial_readPing-Ke Shih
Since read RF register is an indirect access that hardware needs time to accomplish read action, but there's no ready bit, so delay is required to guarantee the read value is correct. After investigating internal documents, these delays are reduced as proper values. Reported-by: Lucas Stach <dev@lynxeye.de> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20rtlwifi: rf_lock use non-irqsave spin_lockPing-Ke Shih
rf_lock is used to protect RF register access, but they will not called from interrupt context, so *_irqsave version isn't necessary. Then, these delays don't affect IRQ services. The old code holds spin_lock_irqsave() that will be detected a long delay as below: kworker/-276 4d... 0us : _raw_spin_lock_irqsave kworker/-276 4d... 0us : rtl8723_phy_rf_serial_read <-rtl8723de_phy_set_rf_reg kworker/-276 4d... 1us : rtl8723_phy_query_bb_reg <-rtl8723_phy_rf_serial_read kworker/-276 4d... 3us : rtl8723_phy_set_bb_reg <-rtl8723_phy_rf_serial_read kworker/-276 4d... 4us : __const_udelay <-rtl8723_phy_rf_serial_read kworker/-276 4d... 4us!: delay_mwaitx <-rtl8723_phy_rf_serial_read kworker/-276 4d... 1004us : rtl8723_phy_set_bb_reg <-rtl8723_phy_rf_serial_read [...] Reported-by: Lucas Stach <dev@lynxeye.de> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20ipw2x00: remove set but not used variable 'force_update'zhengbin
Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/wireless/intel/ipw2x00/ipw2100.c: In function shim__set_security: drivers/net/wireless/intel/ipw2x00/ipw2100.c:5582:9: warning: variable force_update set but not used [-Wunused-but-set-variable] Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20ipw2x00: remove set but not used variable 'reason'zhengbin
Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/wireless/intel/ipw2x00/ipw2200.c: In function ipw_wx_set_mlme: drivers/net/wireless/intel/ipw2x00/ipw2200.c:6805:9: warning: variable reason set but not used [-Wunused-but-set-variable] Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-20brcmfmac: remove set but not used variable 'mpnum','nsp','nmp'zhengbin
Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c: In function brcmf_chip_dmp_get_regaddr: drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c:790:5: warning: variable mpnum set but not used [-Wunused-but-set-variable] drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c: In function brcmf_chip_dmp_erom_scan: drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c:866:10: warning: variable nsp set but not used [-Wunused-but-set-variable] drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c: In function brcmf_chip_dmp_erom_scan: drivers/net/wireless/broadcom/brcm80211/brcmfmac/chip.c:866:5: warning: variable nmp set but not used [-Wunused-but-set-variable] Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-11-19mdio_bus: Fix init if CONFIG_RESET_CONTROLLER=nGeert Uytterhoeven
Commit 1d4639567d97 ("mdio_bus: Fix PTR_ERR applied after initialization to constant") accidentally changed a check from -ENOTSUPP to -ENOSYS, causing failures if reset controller support is not enabled. E.g. on r7s72100/rskrza1: sh-eth e8203000.ethernet: MDIO init failed: -524 sh-eth: probe of e8203000.ethernet failed with error -524 Seen on r8a7740/armadillo, r7s72100/rskrza1, and r7s9210/rza2mevb. Fixes: 1d4639567d97 ("mdio_bus: Fix PTR_ERR applied after initialization to constant") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: YueHaibing <yuehaibing@huawei.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19Revert "mdio_bus: fix mdio_register_device when RESET_CONTROLLER is disabled"David S. Miller
This reverts commit 075e238d12c21c8bde700d21fb48be7a3aa80194. Going to go with Geert's fix instead, which also has a correct Fixes tag. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19net: hns3: fix a wrong reset interrupt status maskHuazhong Tan
According to hardware user manual, bits5~7 in register HCLGE_MISC_VECTOR_INT_STS means reset interrupts status, but HCLGE_RESET_INT_M is defined as bits0~2 now. So it will make hclge_reset_err_handle() read the wrong reset interrupt status. This patch fixes this wrong bit mask. Fixes: 2336f19d7892 ("net: hns3: check reset interrupt status when reset fails") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19net: fec: fix clock count mis-matchChuhong Yuan
pm_runtime_put_autosuspend in probe will call runtime suspend to disable clks automatically if CONFIG_PM is defined. (If CONFIG_PM is not defined, its implementation will be empty, then runtime suspend will not be called.) Therefore, we can call pm_runtime_get_sync to runtime resume it first to enable clks, which matches the runtime suspend. (Only when CONFIG_PM is defined, otherwise pm_runtime_get_sync will also be empty, then runtime resume will not be called.) Then it is fine to disable clks without causing clock count mis-match. Fixes: c43eab3eddb4 ("net: fec: add missed clk_disable_unprepare in remove") Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Acked-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19net/sched: act_pedit: fix WARN() in the traffic pathDavide Caratti
when configuring act_pedit rules, the number of keys is validated only on addition of a new entry. This is not sufficient to avoid hitting a WARN() in the traffic path: for example, it is possible to replace a valid entry with a new one having 0 extended keys, thus causing splats in dmesg like: pedit BUG: index 42 WARNING: CPU: 2 PID: 4054 at net/sched/act_pedit.c:410 tcf_pedit_act+0xc84/0x1200 [act_pedit] [...] RIP: 0010:tcf_pedit_act+0xc84/0x1200 [act_pedit] Code: 89 fa 48 c1 ea 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e ac 00 00 00 48 8b 44 24 10 48 c7 c7 a0 c4 e4 c0 8b 70 18 e8 1c 30 95 ea <0f> 0b e9 a0 fa ff ff e8 00 03 f5 ea e9 14 f4 ff ff 48 89 58 40 e9 RSP: 0018:ffff888077c9f320 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffac2983a2 RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff888053927bec RBP: dffffc0000000000 R08: ffffed100a726209 R09: ffffed100a726209 R10: 0000000000000001 R11: ffffed100a726208 R12: ffff88804beea780 R13: ffff888079a77400 R14: ffff88804beea780 R15: ffff888027ab2000 FS: 00007fdeec9bd740(0000) GS:ffff888053900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffdb3dfd000 CR3: 000000004adb4006 CR4: 00000000001606e0 Call Trace: tcf_action_exec+0x105/0x3f0 tcf_classify+0xf2/0x410 __dev_queue_xmit+0xcbf/0x2ae0 ip_finish_output2+0x711/0x1fb0 ip_output+0x1bf/0x4b0 ip_send_skb+0x37/0xa0 raw_sendmsg+0x180c/0x2430 sock_sendmsg+0xdb/0x110 __sys_sendto+0x257/0x2b0 __x64_sys_sendto+0xdd/0x1b0 do_syscall_64+0xa5/0x4e0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fdeeb72e993 Code: 48 8b 0d e0 74 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 0d d6 2c 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 4b cc 00 00 48 89 04 24 RSP: 002b:00007ffdb3de8a18 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 000055c81972b700 RCX: 00007fdeeb72e993 RDX: 0000000000000040 RSI: 000055c81972b700 RDI: 0000000000000003 RBP: 00007ffdb3dea130 R08: 000055c819728510 R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040 R13: 000055c81972b6c0 R14: 000055c81972969c R15: 0000000000000080 Fix this moving the check on 'nkeys' earlier in tcf_pedit_init(), so that attempts to install rules having 0 keys are always rejected with -EINVAL. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19net: phylink: fix link mode modification in PHY modeRussell King
Modifying the link settings via phylink_ethtool_ksettings_set() and phylink_ethtool_set_pauseparam() didn't always work as intended for PHY based setups, as calling phylink_mac_config() would result in the unresolved configuration being committed to the MAC, rather than the configuration with the speed and duplex setting. This would work fine if the update caused the link to renegotiate, but if no settings have changed, phylib won't trigger a renegotiation cycle, and the MAC will be left incorrectly configured. Avoid calling phylink_mac_config() unless we are using an inband mode in phylink_ethtool_ksettings_set(), and use phy_set_asym_pause() as introduced in 4.20 to set the PHY settings in phylink_ethtool_set_pauseparam(). Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19net: phylink: update documentation on create and destroyRussell King
Update the documentation on phylink's create and destroy functions to explicitly state that the rtnl lock must not be held while calling these. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19bpf: Make array_map_mmap staticYueHaibing
Fix sparse warning: kernel/bpf/arraymap.c:481:5: warning: symbol 'array_map_mmap' was not declared. Should it be static? Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191119142113.15388-1-yuehaibing@huawei.com
2019-11-19selftests/bpf: Enforce no-ALU32 for test_progs-no_alu32Andrii Nakryiko
With the most recent Clang, alu32 is enabled by default if -mcpu=probe or -mcpu=v3 is specified. Use a separate build rule with -mcpu=v2 to enforce no ALU32 mode. Suggested-by: Yonghong Song <yhs@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191120002510.4130605-1-andriin@fb.com
2019-11-19r8169: disable TSO on a single version of RTL8168c to fix performanceCorinna Vinschen
During performance testing, I found that one of my r8169 NICs suffered a major performance loss, a 8168c model. Running netperf's TCP_STREAM test didn't return the expected throughput of > 900 Mb/s, but rather only about 22 Mb/s. Strange enough, running the TCP_MAERTS and UDP_STREAM tests all returned with throughput > 900 Mb/s, as did TCP_STREAM with the other r8169 NICs I can test (either one of 8169s, 8168e, 8168f). Bisecting turned up commit 93681cd7d94f83903cb3f0f95433d10c28a7e9a5, "r8169: enable HW csum and TSO" as the culprit. I added my 8168c version, RTL_GIGA_MAC_VER_22, to the code special-casing the 8168evl as per the patch below. This fixed the performance problem for me. Fixes: 93681cd7d94f ("r8169: enable HW csum and TSO") Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19MAINTAINERS: forcedeth: Change Zhu Yanjun's email addressZhu Yanjun
I prefer to use my personal email address for kernel related work. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Acked-by: Rain River <rain.1986.08.12@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19cxgb4: remove unneeded semicolon for switch blockRahul Lakkireddy
Semicolon is not required at the end of switch block. So, remove it. Addresses coccinelle warning: drivers/net/ethernet/chelsio/cxgb4/sge.c:2260:2-3: Unneeded semicolon Fixes: 4846d5330daf ("cxgb4: add Tx and Rx path for ETHOFLD traffic") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19taprio: don't reject same mqprio settingsIvan Khoronzhuk
The taprio qdisc allows to set mqprio setting but only once. In case if mqprio settings are provided next time the error is returned as it's not allowed to change traffic class mapping in-flignt and that is normal. But if configuration is absolutely the same - no need to return error. It allows to provide same command couple times, changing only base time for instance, or changing only scheds maps, but leaving mqprio setting w/o modification. It more corresponds the message: "Changing the traffic mapping of a running schedule is not supported", so reject mqprio if it's really changed. Also corrected TC_BITMASK + 1 for consistency, as proposed. Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule") Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Tested-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19net: dsa: felix: Fix CPU port assignment when not last portVladimir Oltean
On the NXP LS1028A, there are 2 Ethernet links between the Felix switch and the ENETC: - eno2 <-> swp4, at 2.5G - eno3 <-> swp5, at 1G Only one of the above Ethernet port pairs can act as a DSA link for tagging. When adding initial support for the driver, it was tested only on the 1G eno3 <-> swp5 interface, due to the necessity of using PHYLIB initially (which treats fixed-link interfaces as emulated C22 PHYs, so it doesn't support fixed-link speeds higher than 1G). After making PHYLINK work, it appears that swp4 still can't act as CPU port. So it looks like ocelot_set_cpu_port was being called for swp4, but then it was called again for swp5, overwriting the CPU port assigned in the DT. It appears that when you call dsa_upstream_port for a port that is not defined in the device tree (such as swp5 when using swp4 as CPU port), its dp->cpu_dp pointer is not initialized by dsa_tree_setup_default_cpu, and this trips up the following condition in dsa_upstream_port: if (!cpu_dp) return port; So the moral of the story is: don't call dsa_upstream_port for a port that is not defined in the device tree, and therefore its dsa_port structure is not completely initialized (ds->num_ports is still 6). Fixes: 56051948773e ("net: dsa: ocelot: add driver for Felix switch family") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19net/tls: enable sk_msg redirect to tls socket egressWillem de Bruijn
Bring back tls_sw_sendpage_locked. sk_msg redirection into a socket with TLS_TX takes the following path: tcp_bpf_sendmsg_redir tcp_bpf_push_locked tcp_bpf_push kernel_sendpage_locked sock->ops->sendpage_locked Also update the flags test in tls_sw_sendpage_locked to allow flag MSG_NO_SHARED_FRAGS. bpf_tcp_sendmsg sets this. Link: https://lore.kernel.org/netdev/CA+FuTSdaAawmZ2N8nfDDKu3XLpXBbMtcCT0q4FntDD2gn8ASUw@mail.gmail.com/T/#t Link: https://github.com/wdebruij/kerneltools/commits/icept.2 Fixes: 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect through ULP") Fixes: f3de19af0f5b ("Revert \"net/tls: remove unused function tls_sw_sendpage_locked\"") Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19libbpf: Fix call relocation offset calculation bugAndrii Nakryiko
When relocating subprogram call, libbpf doesn't take into account relo->text_off, which comes from symbol's value. This generally works fine for subprograms implemented as static functions, but breaks for global functions. Taking a simplified test_pkt_access.c as an example: __attribute__ ((noinline)) static int test_pkt_access_subprog1(volatile struct __sk_buff *skb) { return skb->len * 2; } __attribute__ ((noinline)) static int test_pkt_access_subprog2(int val, volatile struct __sk_buff *skb) { return skb->len + val; } SEC("classifier/test_pkt_access") int test_pkt_access(struct __sk_buff *skb) { if (test_pkt_access_subprog1(skb) != skb->len * 2) return TC_ACT_SHOT; if (test_pkt_access_subprog2(2, skb) != skb->len + 2) return TC_ACT_SHOT; return TC_ACT_UNSPEC; } When compiled, we get two relocations, pointing to '.text' symbol. .text has st_value set to 0 (it points to the beginning of .text section): 0000000000000008 000000050000000a R_BPF_64_32 0000000000000000 .text 0000000000000040 000000050000000a R_BPF_64_32 0000000000000000 .text test_pkt_access_subprog1 and test_pkt_access_subprog2 offsets (targets of two calls) are encoded within call instruction's imm32 part as -1 and 2, respectively: 0000000000000000 test_pkt_access_subprog1: 0: 61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0) 1: 64 00 00 00 01 00 00 00 w0 <<= 1 2: 95 00 00 00 00 00 00 00 exit 0000000000000018 test_pkt_access_subprog2: 3: 61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0) 4: 04 00 00 00 02 00 00 00 w0 += 2 5: 95 00 00 00 00 00 00 00 exit 0000000000000000 test_pkt_access: 0: bf 16 00 00 00 00 00 00 r6 = r1 ===> 1: 85 10 00 00 ff ff ff ff call -1 2: bc 01 00 00 00 00 00 00 w1 = w0 3: b4 00 00 00 02 00 00 00 w0 = 2 4: 61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0) 5: 64 02 00 00 01 00 00 00 w2 <<= 1 6: 5e 21 08 00 00 00 00 00 if w1 != w2 goto +8 <LBB0_3> 7: bf 61 00 00 00 00 00 00 r1 = r6 ===> 8: 85 10 00 00 02 00 00 00 call 2 9: bc 01 00 00 00 00 00 00 w1 = w0 10: 61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0) 11: 04 02 00 00 02 00 00 00 w2 += 2 12: b4 00 00 00 ff ff ff ff w0 = -1 13: 1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB0_3> 14: b4 00 00 00 02 00 00 00 w0 = 2 0000000000000078 LBB0_3: 15: 95 00 00 00 00 00 00 00 exit Now, if we compile example with global functions, the setup changes. Relocations are now against specifically test_pkt_access_subprog1 and test_pkt_access_subprog2 symbols, with test_pkt_access_subprog2 pointing 24 bytes into its respective section (.text), i.e., 3 instructions in: 0000000000000008 000000070000000a R_BPF_64_32 0000000000000000 test_pkt_access_subprog1 0000000000000048 000000080000000a R_BPF_64_32 0000000000000018 test_pkt_access_subprog2 Calls instructions now encode offsets relative to function symbols and are both set ot -1: 0000000000000000 test_pkt_access_subprog1: 0: 61 10 00 00 00 00 00 00 r0 = *(u32 *)(r1 + 0) 1: 64 00 00 00 01 00 00 00 w0 <<= 1 2: 95 00 00 00 00 00 00 00 exit 0000000000000018 test_pkt_access_subprog2: 3: 61 20 00 00 00 00 00 00 r0 = *(u32 *)(r2 + 0) 4: 0c 10 00 00 00 00 00 00 w0 += w1 5: 95 00 00 00 00 00 00 00 exit 0000000000000000 test_pkt_access: 0: bf 16 00 00 00 00 00 00 r6 = r1 ===> 1: 85 10 00 00 ff ff ff ff call -1 2: bc 01 00 00 00 00 00 00 w1 = w0 3: b4 00 00 00 02 00 00 00 w0 = 2 4: 61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0) 5: 64 02 00 00 01 00 00 00 w2 <<= 1 6: 5e 21 09 00 00 00 00 00 if w1 != w2 goto +9 <LBB2_3> 7: b4 01 00 00 02 00 00 00 w1 = 2 8: bf 62 00 00 00 00 00 00 r2 = r6 ===> 9: 85 10 00 00 ff ff ff ff call -1 10: bc 01 00 00 00 00 00 00 w1 = w0 11: 61 62 00 00 00 00 00 00 r2 = *(u32 *)(r6 + 0) 12: 04 02 00 00 02 00 00 00 w2 += 2 13: b4 00 00 00 ff ff ff ff w0 = -1 14: 1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB2_3> 15: b4 00 00 00 02 00 00 00 w0 = 2 0000000000000080 LBB2_3: 16: 95 00 00 00 00 00 00 00 exit Thus the right formula to calculate target call offset after relocation should take into account relocation's target symbol value (offset within section), call instruction's imm32 offset, and (subtracting, to get relative instruction offset) instruction index of call instruction itself. All that is shifted by number of instructions in main program, given all sub-programs are copied over after main program. Convert few selftests relying on bpf-to-bpf calls to use global functions instead of static ones. Fixes: 48cca7e44f9f ("libbpf: add support for bpf_call") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191119224447.3781271-1-andriin@fb.com
2019-11-19afs: Fix missing timeout resetDavid Howells
In afs_wait_for_call_to_complete(), rather than immediately aborting an operation if a signal occurs, the code attempts to wait for it to complete, using a schedule timeout of 2*RTT (or min 2 jiffies) and a check that we're still receiving relevant packets from the server before we consider aborting the call. We may even ping the server to check on the status of the call. However, there's a missing timeout reset in the event that we do actually get a packet to process, such that if we then get a couple of short stalls, we then time out when progress is actually being made. Fix this by resetting the timeout any time we get something to process. If it's the failure of the call then the call state will get changed and we'll exit the loop shortly thereafter. A symptom of this is data fetches and stores failing with EINTR when they really shouldn't. Fixes: bc5e3a546d55 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-19net-af_xdp: Use correct number of channels from ethtoolLuigi Rizzo
Drivers use different fields to report the number of channels, so take the maximum of all data channels (rx, tx, combined) when determining the size of the xsk map. The current code used only 'combined' which was set to 0 in some drivers e.g. mlx4. Tested: compiled and run xdpsock -q 3 -r -S on mlx4 Signed-off-by: Luigi Rizzo <lrizzo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20191119001951.92930-1-lrizzo@google.com
2019-11-19gve: fix dma sync bug where not all pages syncedAdi Suresh
The previous commit had a bug where the last page in the memory range could not be synced. This change fixes the behavior so that all the required pages are synced. Fixes: 9cfeeb576d49 ("gve: Fixes DMA synchronization") Signed-off-by: Adi Suresh <adisuresh@google.com> Reviewed-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19drm/i915: make pool objects read-onlyMatthew Auld
For our current users we don't expect pool objects to be writable from the gpu. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Fixes: 4f7af1948abc ("drm/i915: Support ro ppgtt mapped cmdparser shadow buffers") Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191119150154.18249-1-matthew.auld@intel.com (cherry picked from commit d18580b08b92ec4105eb0ede2d676e8b1f5a66c3) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-11-19mdio_bus: Fix init if CONFIG_RESET_CONTROLLER=nGeert Uytterhoeven
Commit 1d4639567d97 ("mdio_bus: Fix PTR_ERR applied after initialization to constant") accidentally changed a check from -ENOTSUPP to -ENOSYS, causing failures if reset controller support is not enabled. E.g. on r7s72100/rskrza1: sh-eth e8203000.ethernet: MDIO init failed: -524 sh-eth: probe of e8203000.ethernet failed with error -524 Seen on r8a7740/armadillo, r7s72100/rskrza1, and r7s9210/rza2mevb. Fixes: 1d4639567d97 ("mdio_bus: Fix PTR_ERR applied after initialization to constant") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: YueHaibing <yuehaibing@huawei.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-19nbd:fix memory leak in nbd_get_socket()Sun Ke
Before returning NULL, put the sock first. Cc: stable@vger.kernel.org Fixes: cf1b2326b734 ("nbd: verify socket is supported during setup") Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Sun Ke <sunke32@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-11-18Merge branch 'remove-jited-size-limits'Alexei Starovoitov
Ilya Leoshkevich says: ==================== This patch series introduces usage of relative long jumps and loads in order to lift 64/512k size limits on JITed BPF programs on s390. Patch 1 introduces long relative branches. Patch 2 changes the way literal pool is arranged in order to be compatible with long relative loads. Patch 3 changes the way literal pool base register is loaded for large programs. Patch 4 replaces regular loads with long relative loads where they are totally superior. Patch 5 introduces long relative loads as an alternative way to load constants in large programs. Regular loads are kept and still used for small programs. Patch 6 removes the size limit check. ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-11-18s390/bpf: Remove JITed image size limitationsIlya Leoshkevich
Now that jump and long displacement ranges are no longer a problem, remove the limit on JITed image size. In practice it's still limited by 2G, but with verifier allowing "only" 1M instructions, it's not an issue. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191118180340.68373-7-iii@linux.ibm.com
2019-11-18s390/bpf: Use lg(f)rl when long displacement cannot be usedIlya Leoshkevich
If literal pool grows past 524287 mark, it's no longer possible to use long displacement to reference literal pool entries. In JIT setting maintaining multiple literal pool registers is next to impossible, since we operate on one instruction at a time. Therefore, fall back to loading literal pool entry using PC-relative addressing, and then using a register-register form of the following machine instruction. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191118180340.68373-6-iii@linux.ibm.com
2019-11-18s390/bpf: Use lgrl instead of lg where possibleIlya Leoshkevich
lg and lgrl have the same performance characteristics, but the former requires a base register and is subject to long displacement range limits, while the latter does not. Therefore, lgrl is totally superior to lg and should be used instead whenever possible. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191118180340.68373-5-iii@linux.ibm.com
2019-11-18s390/bpf: Load literal pool register using larlIlya Leoshkevich
Currently literal pool register is loaded using basr, which makes it point not to the beginning of the literal pool, but rather to the next instruction. In case JITed code is larger than 512k, this renders literal pool register absolutely useless due to long displacement range restrictions. The solution is to use larl to make literal pool register point to the very beginning of the literal pool. This makes it always possible to address 512k worth of literal pool entries using long displacement. However, for short programs, in which the entire literal pool is covered by basr-generated base, it is still beneficial to use basr, since it is 4 bytes shorter than larl. Detect situations when basr-generated base does not cover the entire literal pool, and in such cases use larl instead. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191118180340.68373-4-iii@linux.ibm.com
2019-11-18s390/bpf: Align literal pool entriesIlya Leoshkevich
When literal pool size exceeds 512k, it's no longer possible to reference all the entries in it using a single base register and long displacement. Therefore, PC-relative lgfrl and lgrl instructions need to be used. Unfortunately, they require their arguments to be aligned to 4- and 8-byte boundaries respectively. This generates certain overhead due to necessary padding bytes. Grouping 4- and 8-byte entries together reduces the maximum overhead to 6 bytes (2 for aligning 4-byte entries and 4 for aligning 8-byte entries). While in theory it is possible to detect whether or not alignment is needed by comparing the literal pool size with 512k, in practice this leads to having two ways of emitting constants, making the code more complicated. Prefer code simplicity over trivial size saving, and always group and align literal pool entries. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191118180340.68373-3-iii@linux.ibm.com
2019-11-18s390/bpf: Use relative long branchesIlya Leoshkevich
Currently maximum JITed code size is limited to 64k, because JIT can emit only relative short branches, whose range is limited by 64k in both directions. Teach JIT to use relative long branches. There are no compare+branch relative long instructions, so using relative long branches consumes more space due to having to having to emit an explicit comparison instruction. Therefore do this only when relative short branch is not enough. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191118180340.68373-2-iii@linux.ibm.com
2019-11-18bpf: Fix memory leak on object 'data'Colin Ian King
The error return path on when bpf_fentry_test* tests fail does not kfree 'data'. Fix this by adding the missing kfree. Addresses-Coverity: ("Resource leak") Fixes: faeb2dce084a ("bpf: Add kernel test functions for fentry testing") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191118114059.37287-1-colin.king@canonical.com
2019-11-18mdio_bus: fix mdio_register_device when RESET_CONTROLLER is disabledMarek Behún
When CONFIG_RESET_CONTROLLER is disabled, the devm_reset_control_get_exclusive function returns -ENOTSUPP. This is not handled in subsequent check and then the mdio device fails to probe. When CONFIG_RESET_CONTROLLER is enabled, its code checks in OF for reset device, and since it is not present, returns -ENOENT. -ENOENT is handled. Add -ENOTSUPP also. This happened to me when upgrading kernel on Turris Omnia. You either have to enable CONFIG_RESET_CONTROLLER or use this patch. Signed-off-by: Marek Behún <marek.behun@nic.cz> Fixes: 71dd6c0dff51b ("net: phy: add support for reset-controller") Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18net/ipv4: fix sysctl max for fib_multipath_hash_policyMarcelo Ricardo Leitner
Commit eec4844fae7c ("proc/sysctl: add shared variables for range check") did: - .extra2 = &two, + .extra2 = SYSCTL_ONE, here, which doesn't seem to be intentional, given the changelog. This patch restores it to the previous, as the value of 2 still makes sense (used in fib_multipath_hash()). Fixes: eec4844fae7c ("proc/sysctl: add shared variables for range check") Cc: Matteo Croce <mcroce@redhat.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>