summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-12-05ch_ktls: fix build warning for ipv4-only configArnd Bergmann
When CONFIG_IPV6 is disabled, clang complains that a variable is uninitialized for non-IPv4 data: drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:1046:6: error: variable 'cntrl1' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] if (tx_info->ip_family == AF_INET) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:1059:2: note: uninitialized use occurs here cntrl1 |= T6_TXPKT_ETHHDR_LEN_V(maclen - ETH_HLEN) | ^~~~~~ Replace the preprocessor conditional with the corresponding C version, and make the ipv4 case unconditional in this configuration to improve readability and avoid the warning. Fixes: 86716b51d14f ("ch_ktls: Update cheksum information") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20201203222641.964234-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-05Merge tag 'powerpc-5.10-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Some more powerpc fixes for 5.10: - Three commits fixing possible missed TLB invalidations for multi-threaded processes when CPUs are hotplugged in and out. - A fix for a host crash triggerable by host userspace (qemu) in KVM on Power9. - A fix for a host crash in machine check handling when running HPT guests on a HPT host. - One commit fixing potential missed TLB invalidations when using the hash MMU on Power9 or later. - A regression fix for machines with CPUs on node 0 but no memory. Thanks to Aneesh Kumar K.V, Cédric Le Goater, Greg Kurz, Milan Mohanty, Milton Miller, Nicholas Piggin, Paul Mackerras, and Srikar Dronamraju" * tag 'powerpc-5.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s/powernv: Fix memory corruption when saving SLB entries on MCE KVM: PPC: Book3S HV: XIVE: Fix vCPU id sanity check powerpc/numa: Fix a regression on memoryless node 0 powerpc/64s: Trim offlined CPUs from mm_cpumasks kernel/cpu: add arch override for clear_tasks_mm_cpumask() mm handling powerpc/64s/pseries: Fix hash tlbiel_all_isa300 for guest kernels powerpc/64s: Fix hash ISA v3.0 TLBIEL instruction generation
2020-12-05Merge tag '5.10-rc6-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs fixes from Steve French: "Three smb3 fixes (two for stable) fixing - a null pointer issue in a DFS error path - a problem with excessive padding when mounted with "idsfromsid" causing owner fields to get corrupted - a more recent problem with compounded reparse point query found in testing to the Linux kernel server" * tag '5.10-rc6-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: refactor create_sd_buf() and and avoid corrupting the buffer cifs: add NULL check for ses->tcon_ipc smb3: set COMPOUND_FID to FileID field of subsequent compound request
2020-12-05Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four small fixes in two drivers. The mpt3sas fixes are all problems with timeout under unusual conditions, and the storvsc is a missed incoming packet validation and a missed error return" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: mpt3sas: Increase IOCInit request timeout to 30s scsi: mpt3sas: Fix ioctl timeout scsi: storvsc: Validate length of incoming packet in storvsc_on_channel_callback() scsi: storvsc: Fix error return in storvsc_probe()
2020-12-05Merge tag 'for-5.10/dm-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull fix for device mapper fixes from Mike Snitzer: "Apologies for the glaring bug I introduced with my previous pull request! Fix incorrect branching at top of blk_max_size_offset()" * tag 'for-5.10/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: block: fix incorrect branching in blk_max_size_offset()
2020-12-05clocksource/drivers/arm_arch_timer: Correct fault programming of ↵Keqian Zhu
CNTKCTL_EL1.EVNTI ARM virtual counter supports event stream, it can only trigger an event when the trigger bit (the value of CNTKCTL_EL1.EVNTI) of CNTVCT_EL0 changes, so the actual period of event stream is 2^(cntkctl_evnti + 1). For example, when the trigger bit is 0, then virtual counter trigger an event for every two cycles. While we're at it, rework the way we compute the trigger bit position by making it more obvious that when bits [n:n-1] are both set (with n being the most significant bit), we pick bit (n + 1). Fixes: 037f637767a8 ("drivers: clocksource: add support for ARM architected timer event stream") Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201204073126.6920-3-zhukeqian1@huawei.com
2020-12-05clocksource/drivers/arm_arch_timer: Use stable count reader in erratum sneKeqian Zhu
In commit 0ea415390cd3 ("clocksource/arm_arch_timer: Use arch_timer_read_counter to access stable counters"), we separate stable and normal count reader to omit unnecessary overhead on systems that have no timer erratum. However, in erratum_set_next_event_tval_generic(), count reader becomes normal reader. This converts it to stable reader. Fixes: 0ea415390cd3 ("clocksource/arm_arch_timer: Use arch_timer_read_counter to access stable counters") Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201204073126.6920-2-zhukeqian1@huawei.com
2020-12-05clocksource/drivers/dw_apb_timer_of: Add error handling if no clock availableDinh Nguyen
commit ("b0fc70ce1f02 arm64: berlin: Select DW_APB_TIMER_OF") added the support for the dw_apb_timer into the arm64 defconfig. However, for some platforms like the Intel Stratix10 and Agilex, the clock manager doesn't get loaded until after the timer driver get loaded. Thus, the driver hits the panic "No clock nor clock-frequency property for" because it cannot properly get the clock. This patch adds the error handling needed for the timer driver so that the kernel can continue booting instead of just hitting the panic. Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20201205105223.208604-1-dinguyen@kernel.org
2020-12-05drm/msm: add IOMMU_SUPPORT dependencyArnd Bergmann
The iommu pgtable support is only available when IOMMU support is built into the kernel: WARNING: unmet direct dependencies detected for IOMMU_IO_PGTABLE Depends on [n]: IOMMU_SUPPORT [=n] Selected by [y]: - DRM_MSM [=y] && HAS_IOMEM [=y] && DRM [=y] && (ARCH_QCOM [=y] || SOC_IMX5 || ARM && COMPILE_TEST [=y]) && OF [=y] && COMMON_CLK [=y] && MMU [=y] && (QCOM_OCMEM [=y] || QCOM_OCMEM [=y]=n) Fix the dependency accordingly. There is no need for depending on CONFIG_MMU any more, as that is implied by the iommu support. Fixes: b145c6e65eb0 ("drm/msm: Add support to create a local pagetable") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-12-05drm/msm: a5xx: Make preemption reset case reentrantMarijn Suijten
nr_rings is reset to 1, but when this function is called for a second (and third!) time nr_rings > 1 is false, thus the else case is entered to set up a buffer for the RPTR shadow and consequently written to RB_RPTR_ADDR, hanging platforms without WHERE_AM_I firmware support. Restructure the condition in such a way that shadow buffer setup only ever happens when has_whereami is true; otherwise preemption is only finalized when the number of ring buffers has not been reset to 1 yet. Fixes: 8907afb476ac ("drm/msm: Allow a5xx to mark the RPTR shadow as privileged") Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@somainline.org> Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-12-05drm/msm/dpu: enable DSPP support on SM8[12]50Dmitry Baryshkov
Add support for color correction sub block on SM8150 and SM8250. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
2020-12-05driver core: auxiliary bus: Fix auxiliary bus shutdown null auxdrv ptrDave Jiang
If the probe of the auxdrv failed, the device->driver is set to NULL. During kernel shutdown, the bus shutdown will call auxdrv->shutdown and cause an invalid ptr dereference. Add check to make sure device->driver is not NULL before we proceed. Fixes: 7de3697e9cbd ("Add auxiliary bus support") Cc: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/160710040926.1889434.8840329810698403478.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-05i2c: mlxbf: Fix the return check of devm_ioremap and ioremapWang Xiaojun
devm_ioremap and ioremap may return NULL which cannot be checked by IS_ERR. Signed-off-by: Wang Xiaojun <wangxiaojun11@huawei.com> Reported-by: Hulk Robot <hulkci@huawei.com> Acked-by: Khalil Blaiech <kblaiech@nvidia.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-12-05i2c: mlxbf: select CONFIG_I2C_SLAVEArnd Bergmann
If this is not enabled, the interfaces used in this driver do not work: drivers/i2c/busses/i2c-mlxbf.c:1888:3: error: implicit declaration of function 'i2c_slave_event' [-Werror,-Wimplicit-function-declaration] i2c_slave_event(slave, I2C_SLAVE_WRITE_REQUESTED, &value); ^ drivers/i2c/busses/i2c-mlxbf.c:1888:26: error: use of undeclared identifier 'I2C_SLAVE_WRITE_REQUESTED' i2c_slave_event(slave, I2C_SLAVE_WRITE_REQUESTED, &value); ^ drivers/i2c/busses/i2c-mlxbf.c:1890:32: error: use of undeclared identifier 'I2C_SLAVE_WRITE_RECEIVED' ret = i2c_slave_event(slave, I2C_SLAVE_WRITE_RECEIVED, ^ drivers/i2c/busses/i2c-mlxbf.c:1892:26: error: use of undeclared identifier 'I2C_SLAVE_STOP' i2c_slave_event(slave, I2C_SLAVE_STOP, &value); ^ Fixes: b5b5b32081cd ("i2c: mlxbf: I2C SMBus driver for Mellanox BlueField SoC") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Khalil Blaiech <kblaiech@nvidia.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-12-05dt-bindings: phy: Convert Broadcom SATA PHY to YAMLFlorian Fainelli
Update the Broadcom SATA PHY Device Tree binding to a YAML format. Suggested-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20201204193532.1934108-1-f.fainelli@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-12-05devicetree: phy: rockchip-emmc add output-tapdelay-selectChris Ruehl
Update the rockchip-emmc-phy.txt and add the u32 property 'output-tapdelay-select'. This allow to set the otapdlysec register. Tested with our customized rk3399 board to tune eMMC. Signed-off-by: Chris Ruehl <chris.ruehl@gtsys.com.hk> Link: https://lore.kernel.org/r/20201202082507.3536-3-chris.ruehl@gtsys.com.hk Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-12-05phy: rockchip-emmc: output tap delay dt propertyChris Ruehl
Update the rockchip-emmc phy to set the otapdlysec register with a dt property. This was mentioned from Brian Norris when he sent the path to set the default value in the driver. This patch add a dt property 'output-tapdelay-select' u32 and allow to set the 0x0-0xf. If not set in dts, the old default 0x4 applies. Tested with our customized rk3399 to tune the eMMC. Signed-off-by: Chris Ruehl <chris.ruehl@gtsys.com.hk> Link: https://lore.kernel.org/r/20201202082507.3536-2-chris.ruehl@gtsys.com.hk Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-12-05PHY: Ingenic: Add USB PHY driver using generic PHY framework.周琰杰 (Zhou Yanjie)
Used the generic PHY framework API to create the PHY, this driver supoorts USB OTG PHY used in JZ4770 SoC, JZ4775 SoC, JZ4780 SoC, X1000 SoC, X1830 SoC and X2000 SoC. Co-developed-by: 漆鹏振 (Qi Pengzhen) <aric.pzqi@ingenic.com> Signed-off-by: 漆鹏振 (Qi Pengzhen) <aric.pzqi@ingenic.com> Signed-off-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com> Tested-by: 周正 (Zhou Zheng) <sernia.zhou@foxmail.com> Tested-by: H. Nikolaus Schaller <hns@goldelico.com> Reviewed-by: Paul Cercueil <paul@crapouillou.net> Link: https://lore.kernel.org/r/20201116141906.11758-4-zhouyanjie@wanyeetech.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-12-05dt-bindings: USB: Add bindings for Ingenic JZ4775 and X2000.周琰杰 (Zhou Yanjie)
Move Ingenic USB PHY bindings from Documentation/devicetree/bindings/usb to Documentation/devicetree/bindings/phy, and add bindings for JZ4775 SoC and X2000 SoC. Signed-off-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20201116141906.11758-3-zhouyanjie@wanyeetech.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-12-05USB: PHY: JZ4770: Remove unnecessary function calls.周琰杰 (Zhou Yanjie)
Remove unnecessary "of_match_ptr()", because Ingenic SoCs all depend on Device Tree. Suggested-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com> Reviewed-by: Paul Cercueil <paul@crapouillou.net> Link: https://lore.kernel.org/r/20201116141906.11758-2-zhouyanjie@wanyeetech.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-12-05devicetree: phy: rockchip-emmc: pulldown propertyChris Ruehl
Update the documentation and add the bool property enable-strobe-pulldown used to enable the internal pull-down for the strobe line. Signed-off-by: Chris Ruehl <chris.ruehl@gtsys.com.hk> Link: https://lore.kernel.org/r/20201129054416.3980-3-chris.ruehl@gtsys.com.hk Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-12-05phy: rockchip: set pulldown for strobe line in dtsChris Ruehl
This patch add support to set the internal pulldown via dt property and allow simplify the board design for the trace from emmc-phy to the eMMC chipset. Default to not set the pull-down. This patch was inspired from the 4.4 tree of the Rockchip SDK, where it is enabled unconditional. The patch had been tested with our rk3399 customized board. Signed-off-by: Chris Ruehl <chris.ruehl@gtsys.com.hk> Link: https://lore.kernel.org/r/20201129054416.3980-2-chris.ruehl@gtsys.com.hk Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-12-05phy: renesas: rcar-gen3-usb2: disable runtime pm in case of failureWang Li
pm_runtime_enable() will decrease power disable depth. Thus a pairing increment is needed on the error handling path to keep it balanced. Fixes: 5d8042e95fd4 ("phy: rcar-gen3-usb2: Add support for r8a77470") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Li <wangli74@huawei.com> Link: https://lore.kernel.org/r/20201126024412.4046845-1-wangli74@huawei.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-12-05phy: mediatek: allow compile-testing the hdmi phyArnd Bergmann
Compile-testing the DRM_MEDIATEK_HDMI driver shows two missing dependencies, one results in a link failure: arm-linux-gnueabi-ld: drivers/phy/mediatek/phy-mtk-hdmi.o: in function `mtk_hdmi_phy_probe': phy-mtk-hdmi.c:(.text+0xd8): undefined reference to `__clk_get_name' arm-linux-gnueabi-ld: phy-mtk-hdmi.c:(.text+0x12c): undefined reference to `devm_clk_register' arm-linux-gnueabi-ld: phy-mtk-hdmi.c:(.text+0x250): undefined reference to `of_clk_add_provider' arm-linux-gnueabi-ld: phy-mtk-hdmi.c:(.text+0x298): undefined reference to `of_clk_src_simple_get' The other one is a harmless warning: WARNING: unmet direct dependencies detected for PHY_MTK_HDMI Depends on [n]: ARCH_MEDIATEK [=n] && OF [=y] Selected by [y]: - DRM_MEDIATEK_HDMI [=y] && HAS_IOMEM [=y] && DRM_MEDIATEK [=y] Fix these by adding dependencies on CONFIG_OF and CONFIG_COMMON_CLK. With that done, there is also no reason against adding CONFIG_COMPILE_TEST. Fixes: b28be59a2e26 ("phy: mediatek: Move mtk_hdmi_phy driver into drivers/phy/mediatek folder") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20201204135650.2744481-1-arnd@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-12-05soundwire: intel: fix another unused-function warningArnd Bergmann
Without CONFIG_PM, there is another warning about an unused function: drivers/soundwire/intel.c:530:12: error: 'intel_link_power_down' defined but not used [-Werror=unused-function] After a previous fix, the driver already uses both an #ifdef and a __maybe_unused annotation but still gets it wrong. Remove the ifdef and instead use __maybe_unused consistently to avoid the problem for good. Fixes: f046b2334083 ("soundwire: intel: fix intel_suspend/resume defined but not used warning") Fixes: ebf878eddbb4 ("soundwire: intel: add pm_runtime support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20201203230502.1480063-1-arnd@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org>
2020-12-05ALSA: hda/realtek: make bass spk volume adjustable on a yoga laptopHui Wang
This change could fix 2 issues on this machine: - the bass speaker's output volume can't be adjusted, that is because the bass speaker is routed to the DAC (Nid 0x6) which has no volume control. - after plugging a headset with vol+, vol- and pause buttons on it, press those buttons, nothing happens, this means those buttons don't work at all. This machine has alc287 codec, need to add the codec id to the disable/enable_headset_jack_key(), then the headset button could work. The quirk of ALC285_FIXUP_THINKPAD_HEADSET_JACK could fix both of these 2 issues. Cc: <stable@vger.kernel.org> Signed-off-by: Hui Wang <hui.wang@canonical.com> Link: https://lore.kernel.org/r/20201205051130.8122-1-hui.wang@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-12-04net/nfc/nci: Support NCI 2.x initial sequenceBongsu Jeon
implement the NCI 2.x initial sequence to support NCI 2.x NFCC. Since NCI 2.0, CORE_RESET and CORE_INIT sequence have been changed. If NFCEE supports NCI 2.x, then NCI 2.x initial sequence will work. In NCI 1.0, Initial sequence and payloads are as below: (DH) (NFCC) | -- CORE_RESET_CMD --> | | <-- CORE_RESET_RSP -- | | -- CORE_INIT_CMD --> | | <-- CORE_INIT_RSP -- | CORE_RESET_RSP payloads are Status, NCI version, Configuration Status. CORE_INIT_CMD payloads are empty. CORE_INIT_RSP payloads are Status, NFCC Features, Number of Supported RF Interfaces, Supported RF Interface, Max Logical Connections, Max Routing table Size, Max Control Packet Payload Size, Max Size for Large Parameters, Manufacturer ID, Manufacturer Specific Information. In NCI 2.0, Initial Sequence and Parameters are as below: (DH) (NFCC) | -- CORE_RESET_CMD --> | | <-- CORE_RESET_RSP -- | | <-- CORE_RESET_NTF -- | | -- CORE_INIT_CMD --> | | <-- CORE_INIT_RSP -- | CORE_RESET_RSP payloads are Status. CORE_RESET_NTF payloads are Reset Trigger, Configuration Status, NCI Version, Manufacturer ID, Manufacturer Specific Information Length, Manufacturer Specific Information. CORE_INIT_CMD payloads are Feature1, Feature2. CORE_INIT_RSP payloads are Status, NFCC Features, Max Logical Connections, Max Routing Table Size, Max Control Packet Payload Size, Max Data Packet Payload Size of the Static HCI Connection, Number of Credits of the Static HCI Connection, Max NFC-V RF Frame Size, Number of Supported RF Interfaces, Supported RF Interfaces. Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com> Link: https://lore.kernel.org/r/20201202223147.3472-1-bongsu.jeon@samsung.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04selftests: forwarding: Add MPLS L2VPN testGuillaume Nault
Connect hosts H1 and H2 using two intermediate encapsulation routers (LER1 and LER2). These routers encapsulate traffic from the hosts, including the original Ethernet header, into MPLS. Use ping to test reachability between H1 and H2. Signed-off-by: Guillaume Nault <gnault@redhat.com> Link: https://lore.kernel.org/r/625f5c1aafa3a8085f8d3e082d680a82e16ffbaa.1606918980.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04net: bna: remove trailing semicolon in macro definitionTom Rix
The macro use will already have a semicolon. Clean up escaped newlines. Signed-off-by: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/20201202163622.3733506-1-trix@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04tipc: support 128bit node identity for peer removingHoang Le
We add the support to remove a specific node down with 128bit node identifier, as an alternative to legacy 32-bit node address. example: $tipc peer remove identiy <1001002|16777777> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au> Link: https://lore.kernel.org/r/20201203035045.4564-1-hoang.h.le@dektech.com.au Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04mac80211: mesh: fix mesh_pathtbl_init() error pathEric Dumazet
If tbl_mpp can not be allocated, we call mesh_table_free(tbl_path) while tbl_path rhashtable has not yet been initialized, which causes panics. Simply factorize the rhashtable_init() call into mesh_table_alloc() WARNING: CPU: 1 PID: 8474 at kernel/workqueue.c:3040 __flush_work kernel/workqueue.c:3040 [inline] WARNING: CPU: 1 PID: 8474 at kernel/workqueue.c:3040 __cancel_work_timer+0x514/0x540 kernel/workqueue.c:3136 Modules linked in: CPU: 1 PID: 8474 Comm: syz-executor663 Not tainted 5.10.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__flush_work kernel/workqueue.c:3040 [inline] RIP: 0010:__cancel_work_timer+0x514/0x540 kernel/workqueue.c:3136 Code: 5d c3 e8 bf ae 29 00 0f 0b e9 f0 fd ff ff e8 b3 ae 29 00 0f 0b 43 80 3c 3e 00 0f 85 31 ff ff ff e9 34 ff ff ff e8 9c ae 29 00 <0f> 0b e9 dc fe ff ff 89 e9 80 e1 07 80 c1 03 38 c1 0f 8c 7d fd ff RSP: 0018:ffffc9000165f5a0 EFLAGS: 00010293 RAX: ffffffff814b7064 RBX: 0000000000000001 RCX: ffff888021c80000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff888024039ca0 R08: dffffc0000000000 R09: fffffbfff1dd3e64 R10: fffffbfff1dd3e64 R11: 0000000000000000 R12: 1ffff920002cbebd R13: ffff888024039c88 R14: 1ffff11004807391 R15: dffffc0000000000 FS: 0000000001347880(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000140 CR3: 000000002cc0a000 CR4: 00000000001506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rhashtable_free_and_destroy+0x25/0x9c0 lib/rhashtable.c:1137 mesh_table_free net/mac80211/mesh_pathtbl.c:69 [inline] mesh_pathtbl_init+0x287/0x2e0 net/mac80211/mesh_pathtbl.c:785 ieee80211_mesh_init_sdata+0x2ee/0x530 net/mac80211/mesh.c:1591 ieee80211_setup_sdata+0x733/0xc40 net/mac80211/iface.c:1569 ieee80211_if_add+0xd5c/0x1cd0 net/mac80211/iface.c:1987 ieee80211_add_iface+0x59/0x130 net/mac80211/cfg.c:125 rdev_add_virtual_intf net/wireless/rdev-ops.h:45 [inline] nl80211_new_interface+0x563/0xb40 net/wireless/nl80211.c:3855 genl_family_rcv_msg_doit net/netlink/genetlink.c:739 [inline] genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] genl_rcv_msg+0xe4e/0x1280 net/netlink/genetlink.c:800 netlink_rcv_skb+0x190/0x3a0 net/netlink/af_netlink.c:2494 genl_rcv+0x24/0x40 net/netlink/genetlink.c:811 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x780/0x930 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x9a8/0xd40 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] ____sys_sendmsg+0x519/0x800 net/socket.c:2353 ___sys_sendmsg net/socket.c:2407 [inline] __sys_sendmsg+0x2b1/0x360 net/socket.c:2440 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 60854fd94573 ("mac80211: mesh: convert path table to rhashtable") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Link: https://lore.kernel.org/r/20201204162428.2583119-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04[SECURITY] fix namespaced fscaps when !CONFIG_SECURITYSerge Hallyn
Namespaced file capabilities were introduced in 8db6c34f1dbc . When userspace reads an xattr for a namespaced capability, a virtualized representation of it is returned if the caller is in a user namespace owned by the capability's owning rootid. The function which performs this virtualization was not hooked up if CONFIG_SECURITY=n. Therefore in that case the original xattr was shown instead of the virtualized one. To test this using libcap-bin (*1), $ v=$(mktemp) $ unshare -Ur setcap cap_sys_admin-eip $v $ unshare -Ur setcap -v cap_sys_admin-eip $v /tmp/tmp.lSiIFRvt8Y: OK "setcap -v" verifies the values instead of setting them, and will check whether the rootid value is set. Therefore, with this bug un-fixed, and with CONFIG_SECURITY=n, setcap -v will fail: $ v=$(mktemp) $ unshare -Ur setcap cap_sys_admin=eip $v $ unshare -Ur setcap -v cap_sys_admin=eip $v nsowner[got=1000, want=0],/tmp/tmp.HHDiOOl9fY differs in [] Fix this bug by calling cap_inode_getsecurity() in security_inode_getsecurity() instead of returning -EOPNOTSUPP, when CONFIG_SECURITY=n. *1 - note, if libcap is too old for getcap to have the '-n' option, then use verify-caps instead. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=209689 Cc: Hervé Guillemet <herve@guillemet.org> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Serge Hallyn <shallyn@cisco.com> Signed-off-by: Andrew G. Morgan <morgan@kernel.org> Signed-off-by: James Morris <jamorris@linux.microsoft.com>
2020-12-04nfp: Replace zero-length array with flexible-array memberSimon Horman
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use "flexible array members"[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Louis Peens <louis.peens@netronome.com> Link: https://lore.kernel.org/r/20201204125601.24876-1-simon.horman@netronome.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04openvswitch: fix error return code in validate_and_copy_dec_ttl()Wang Hai
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Changing 'return start' to 'return action_start' can fix this bug. Fixes: 69929d4c49e1 ("net: openvswitch: fix TTL decrement action netlink message format") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Reviewed-by: Eelco Chaudron <echaudro@redhat.com> Link: https://lore.kernel.org/r/20201204114314.1596-1-wanghai38@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04net: bridge: vlan: fix error return code in __vlan_add()Zhang Changzhong
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: f8ed289fab84 ("bridge: vlan: use br_vlan_(get|put)_master to deal with refcounts") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Link: https://lore.kernel.org/r/1607071737-33875-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04ipv4: fix error return code in rtm_to_fib_config()Zhang Changzhong
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: d15662682db2 ("ipv4: Allow ipv6 gateway with ipv4 routes") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/1607071695-33740-1-git-send-email-zhangchangzhong@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04nfc: s3fwrn5: skip the NFC bootloader modeBongsu Jeon
If there isn't a proper NFC firmware image, Bootloader mode will be skipped. Signed-off-by: Bongsu Jeon <bongsu.jeon@samsung.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://lore.kernel.org/r/20201203225257.2446-1-bongsu.jeon@samsung.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04ASoC: qcom: fix QDSP6 dependencies, attempt #3Arnd Bergmann
The previous fix left another warning in randconfig builds: WARNING: unmet direct dependencies detected for SND_SOC_QDSP6 Depends on [n]: SOUND [=y] && !UML && SND [=y] && SND_SOC [=y] && SND_SOC_QCOM [=y] && QCOM_APR [=y] && COMMON_CLK [=n] Selected by [y]: - SND_SOC_MSM8996 [=y] && SOUND [=y] && !UML && SND [=y] && SND_SOC [=y] && SND_SOC_QCOM [=y] && QCOM_APR [=y] Add one more dependency for this one. Fixes: 2bc8831b135c ("ASoC: qcom: fix SDM845 & QDSP6 dependencies more") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Link: https://lore.kernel.org/r/20201203231443.1483763-1-arnd@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2020-12-04ASoC: fsl_aud2htx: mark PM functions as __maybe_unusedArnd Bergmann
When CONFIG_PM is disabled, we get a warning for unused functions: sound/soc/fsl/fsl_aud2htx.c:261:12: error: unused function 'fsl_aud2htx_runtime_suspend' [-Werror,-Wunused-function] static int fsl_aud2htx_runtime_suspend(struct device *dev) sound/soc/fsl/fsl_aud2htx.c:271:12: error: unused function 'fsl_aud2htx_runtime_resume' [-Werror,-Wunused-function] static int fsl_aud2htx_runtime_resume(struct device *dev) Mark these as __maybe_unused to avoid the warning without adding an #ifdef. Fixes: 8a24c834c053 ("ASoC: fsl_aud2htx: Add aud2htx module driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Nicolin Chen <nicoleotsuka@gmail.com> Link: https://lore.kernel.org/r/20201203222900.1042578-1-arnd@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2020-12-04ASoC: intel: sof_rt5682: Add support for tgl_rt1011_rt5682Brent Lu
This patch adds the driver data for two rt1011 speaker amplifiers on SSP1 and rt5682 on SSP0 for TGL platform. DAI format for rt1011 is leveraged from cml_rt1011_rt5682 which is 4-slot tdm with 100fs bclk. Signed-off-by: Brent Lu <brent.lu@intel.com> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20201203154010.29464-1-brent.lu@intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-12-04ASoC: atmel: mchp-spdifrx needs COMMON_CLKArnd Bergmann
Compile-testing this driver on an older platform without CONFIG_COMMON_CLK fails with ERROR: modpost: "clk_set_min_rate" [sound/soc/atmel/snd-soc-mchp-spdifrx.ko] undefined! Make this is a strict dependency. Fixes: ef265c55c1ac ("ASoC: mchp-spdifrx: add driver for SPDIF RX") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Codrin Ciubotariu <codrin.ciubotariu@microchip.com> Link: https://lore.kernel.org/r/20201203223815.1353451-1-arnd@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2020-12-04ASoC: cros_ec_codec: fix uninitialized memory readArnd Bergmann
gcc points out a memory area that is copied to a device but not initialized: sound/soc/codecs/cros_ec_codec.c: In function 'i2s_rx_event': arch/x86/include/asm/string_32.h:83:20: error: '*((void *)&p+4)' may be used uninitialized in this function [-Werror=maybe-uninitialized] 83 | *((int *)to + 1) = *((int *)from + 1); Initialize all the unused fields to zero. Fixes: 727f1c71c780 ("ASoC: cros_ec_codec: refactor I2S RX") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Tzung-Bi Shih <tzungbi@google.com> Link: https://lore.kernel.org/r/20201203225458.1477830-1-arnd@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2020-12-04ASoC: SOF: control: fix cppcheck warning in snd_sof_volume_info()Pierre-Louis Bossart
Fix cppcheck warning: sound/soc/sof/control.c:117:82: style:inconclusive: Function 'snd_sof_volume_info' argument 2 names different: declaration 'ucontrol' definition 'uinfo'. [funcArgNamesDifferent] Fixes: fca18e62984a ("ASoC: SOF: control: override volume info callback") Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Link: https://lore.kernel.org/r/20201204170313.2704499-1-kai.vehmanen@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2020-12-04ethernet: select CONFIG_CRC32 as neededArnd Bergmann
A number of ethernet drivers require crc32 functionality to be avaialable in the kernel, causing a link error otherwise: arm-linux-gnueabi-ld: drivers/net/ethernet/agere/et131x.o: in function `et1310_setup_device_for_multicast': et131x.c:(.text+0x5918): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/cadence/macb_main.o: in function `macb_start_xmit': macb_main.c:(.text+0x4b88): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/faraday/ftgmac100.o: in function `ftgmac100_set_rx_mode': ftgmac100.c:(.text+0x2b38): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/freescale/fec_main.o: in function `set_multicast_list': fec_main.c:(.text+0x6120): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/freescale/fman/fman_dtsec.o: in function `dtsec_add_hash_mac_address': fman_dtsec.c:(.text+0x830): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/freescale/fman/fman_dtsec.o:fman_dtsec.c:(.text+0xb68): more undefined references to `crc32_le' follow arm-linux-gnueabi-ld: drivers/net/ethernet/netronome/nfp/nfpcore/nfp_hwinfo.o: in function `nfp_hwinfo_read': nfp_hwinfo.c:(.text+0x250): undefined reference to `crc32_be' arm-linux-gnueabi-ld: nfp_hwinfo.c:(.text+0x288): undefined reference to `crc32_be' arm-linux-gnueabi-ld: drivers/net/ethernet/netronome/nfp/nfpcore/nfp_resource.o: in function `nfp_resource_acquire': nfp_resource.c:(.text+0x144): undefined reference to `crc32_be' arm-linux-gnueabi-ld: nfp_resource.c:(.text+0x158): undefined reference to `crc32_be' arm-linux-gnueabi-ld: drivers/net/ethernet/nxp/lpc_eth.o: in function `lpc_eth_set_multicast_list': lpc_eth.c:(.text+0x1934): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/rocker/rocker_ofdpa.o: in function `ofdpa_flow_tbl_do': rocker_ofdpa.c:(.text+0x2e08): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/rocker/rocker_ofdpa.o: in function `ofdpa_flow_tbl_del': rocker_ofdpa.c:(.text+0x3074): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/rocker/rocker_ofdpa.o: in function `ofdpa_port_fdb': arm-linux-gnueabi-ld: drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.o: in function `mlx5dr_ste_calc_hash_index': dr_ste.c:(.text+0x354): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/microchip/lan743x_main.o: in function `lan743x_netdev_set_multicast': lan743x_main.c:(.text+0x5dc4): undefined reference to `crc32_le' Add the missing 'select CRC32' entries in Kconfig for each of them. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Acked-by: Mark Einon <mark.einon@gmail.com> Acked-by: Simon Horman <simon.horman@netronome.com> Link: https://lore.kernel.org/r/20201203232114.1485603-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04net: ipa: pass the correct size when freeing DMA memoryAlex Elder
When the coherent memory is freed in gsi_trans_pool_exit_dma(), we are mistakenly passing the size of a single element in the pool rather than the actual allocated size. Fix this bug. Fixes: 9dd441e4ed575 ("soc: qcom: ipa: GSI transactions") Reported-by: Stephen Boyd <swboyd@chromium.org> Tested-by: Sujit Kautkar <sujitka@chromium.org> Signed-off-by: Alex Elder <elder@linaro.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Link: https://lore.kernel.org/r/20201203215106.17450-1-elder@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04ARM: highmem: Fix cache_is_vivt() referenceArnd Bergmann
The reference to cache_is_vivt() was moved into a header file, which now causes a build failure in rare randconfig builds: arch/arm/include/asm/highmem.h:52:43: error: implicit declaration of function 'cache_is_vivt' [-Werror,-Wimplicit-function-declaration] Add an explicit include to make it build reliably. Fixes: 2a15ba82fa6c ("ARM: highmem: Switch to generic kmap atomic") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Russell King <linux@armlinux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ira Weiny <ira.weiny@intel.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20201204165930.3877571-1-arnd@kernel.org
2020-12-04block: fix incorrect branching in blk_max_size_offset()Mike Snitzer
If non-zero 'chunk_sectors' is passed in to blk_max_size_offset() that override will be incorrectly ignored. Old blk_max_size_offset() branching, prior to commit 3ee16db390b4, must be used only if passed 'chunk_sectors' override is zero. Fixes: 3ee16db390b4 ("dm: fix IO splitting") Cc: stable@vger.kernel.org # 5.9 Reported-by: John Dorminy <jdorminy@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-12-04net/sched: fq_pie: initialize timer earlier in fq_pie_init()Davide Caratti
with the following tdc testcase: 83be: (qdisc, fq_pie) Create FQ-PIE with invalid number of flows as fq_pie_init() fails, fq_pie_destroy() is called to clean up. Since the timer is not yet initialized, it's possible to observe a splat like this: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 0 PID: 975 Comm: tc Not tainted 5.10.0-rc4+ #298 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014 Call Trace: dump_stack+0x99/0xcb register_lock_class+0x12dd/0x1750 __lock_acquire+0xfe/0x3970 lock_acquire+0x1c8/0x7f0 del_timer_sync+0x49/0xd0 fq_pie_destroy+0x3f/0x80 [sch_fq_pie] qdisc_create+0x916/0x1160 tc_modify_qdisc+0x3c4/0x1630 rtnetlink_rcv_msg+0x346/0x8e0 netlink_unicast+0x439/0x630 netlink_sendmsg+0x719/0xbf0 sock_sendmsg+0xe2/0x110 ____sys_sendmsg+0x5ba/0x890 ___sys_sendmsg+0xe9/0x160 __sys_sendmsg+0xd3/0x170 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 [...] ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0 WARNING: CPU: 0 PID: 975 at lib/debugobjects.c:508 debug_print_object+0x162/0x210 [...] Call Trace: debug_object_assert_init+0x268/0x380 try_to_del_timer_sync+0x6a/0x100 del_timer_sync+0x9e/0xd0 fq_pie_destroy+0x3f/0x80 [sch_fq_pie] qdisc_create+0x916/0x1160 tc_modify_qdisc+0x3c4/0x1630 rtnetlink_rcv_msg+0x346/0x8e0 netlink_rcv_skb+0x120/0x380 netlink_unicast+0x439/0x630 netlink_sendmsg+0x719/0xbf0 sock_sendmsg+0xe2/0x110 ____sys_sendmsg+0x5ba/0x890 ___sys_sendmsg+0xe9/0x160 __sys_sendmsg+0xd3/0x170 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 fix it moving timer_setup() before any failure, like it was done on 'red' with former commit 608b4adab178 ("net_sched: initialize timer earlier in red_init()"). Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Cong Wang <cong.wang@bytedance.com> Link: https://lore.kernel.org/r/2e78e01c504c633ebdff18d041833cf2e079a3a4.1607020450.git.dcaratti@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04Merge branch 'perf-optimizations-for-tcp-recv-zerocopy'Jakub Kicinski
Arjun Roy says: ==================== Perf. optimizations for TCP Recv. Zerocopy This patchset contains several optimizations for TCP Recv. Zerocopy. Summarized: 1. It is possible that a read payload is not exactly page aligned - that there may exist "straggler" bytes that we cannot map into the caller's address space cleanly. For this, we allow the caller to provide as argument a "hybrid copy buffer", turning getsockopt(TCP_ZEROCOPY_RECEIVE) into a "hybrid" operation that allows the caller to avoid a subsequent recvmsg() call to read the stragglers. 2. Similarly, for "small" read payloads that are either below the size of a page, or small enough that remapping pages is not a performance win - we allow the user to short-circuit the remapping operations entirely and simply copy into the buffer provided. Some of the patches in the middle of this set are refactors to support this "short-circuiting" optimization. 3. We allow the user to provide a hint that performing a page zap operation (and the accompanying TLB shootdown) may not be necessary, for the provided region that the kernel will attempt to map pages into. This allows us to avoid this expensive operation while holding the socket lock, which provides a significant performance advantage. With all of these changes combined, "medium" sized receive traffic (multiple tens to few hundreds of KB) see significant efficiency gains when using TCP receive zerocopy instead of regular recvmsg(). For example, with RPC-style traffic with 32KB messages, there is a roughly 15% efficiency improvement when using zerocopy. Without these changes, there is a roughly 60-70% efficiency reduction with such messages when employing zerocopy. ==================== Link: https://lore.kernel.org/r/20201202225349.935284-1-arjunroy.kdev@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-04net-zerocopy: Defer vm zap unless actually needed.Arjun Roy
Zapping pages is required only if we are calling vm_insert_page into a region where pages had previously been mapped. Receive zerocopy allows reusing such regions, and hitherto called zap_page_range() before calling vm_insert_page() in that range. zap_page_range() can also be triggered from userspace with madvise(MADV_DONTNEED). If userspace is configured to call this before reusing a segment, or if there was nothing mapped at this virtual address to begin with, we can avoid calling zap_page_range() under the socket lock. That said, if userspace does not do that, then we are still responsible for calling zap_page_range(). This patch adds a flag that the user can use to hint to the kernel that a zap is not required. If the flag is not set, or if an older user application does not have a flags field at all, then the kernel calls zap_page_range as before. Also, if the flag is set but a zap is still required, the kernel performs that zap as necessary. Thus incorrectly indicating that a zap can be avoided does not change the correctness of operation. It also increases the batchsize for vm_insert_pages and prefetches the page struct for the batch since we're about to bump the refcount. An alternative mechanism could be to not have a flag, assume by default a zap is not needed, and fall back to zapping if needed. However, this would harm performance for older applications for which a zap is necessary, and thus we implement it with an explicit flag so newer applications can opt in. When using RPC-style traffic with medium sized (tens of KB) RPCs, this change yields an efficency improvement of about 30% for QPS/CPU usage. Signed-off-by: Arjun Roy <arjunroy@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>