summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-11-04nvdimm/pmem: use add_disk() error handlingLuis Chamberlain
Now that device_add_disk() supports returning an error, use that. We must unwind alloc_dax() on error. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20211103230437.1639990-7-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-04nvdimm/pmem: cleanup the disk if pmem_release_disk() is yet assignedLuis Chamberlain
Prior to devm being able to use pmem_release_disk() there are other failure which can occur for which we must account for and release the disk for. Address those few cases. Fixes: 3dd60fb9d95d ("nvdimm/pmem: stop using q_usage_count as external pgmap refcount") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20211103230437.1639990-6-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-04nvdimm/blk: add error handling support for add_disk()Luis Chamberlain
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Since nvdimm/blk uses devm we just need to move the devm registration towards the end. And in hindsight, that seems to also provide a fix given del_gendisk() should not be called unless the disk was already added via add_disk(). The probably of that issue happening is low though, like OOM while calling devm_add_action(), so the fix is minor. We manually unwind in case of add_disk() failure prior to the devm registration. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20211103230437.1639990-5-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-04nvdimm/blk: avoid calling del_gendisk() on early failuresLuis Chamberlain
If nd_integrity_init() fails we'd get del_gendisk() called, but that's not correct as we should only call that if we're done with device_add_disk(). Fix this by providing unwinding prior to the devm call being registered and moving the devm registration to the very end. This should fix calling del_gendisk() if nd_integrity_init() fails. I only spotted this issue through code inspection. It does not fix any real world bug. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20211103230437.1639990-4-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-04nvdimm/btt: add error handling support for add_disk()Luis Chamberlain
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20211103230437.1639990-3-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-04nvdimm/btt: use goto error labels on btt_blk_init()Luis Chamberlain
This will make it easier to share common error paths. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20211103230437.1639990-2-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-04loop: Remove duplicate assignmentsluo penghao
The assignment and operation there will be overwritten later, so it should be deleted. The clang_analyzer complains as follows: drivers/block/loop.c:2330:2 warning: Value stored to 'err' is never read change in v2: Repair the sending email box Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: luo penghao <luo.penghao@zte.com.cn> Link: https://lore.kernel.org/r/20211104064546.3074-1-luo.penghao@zte.com.cn Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-04drbd: Fix double free problem in drbd_create_deviceWu Bo
In drbd_create_device(), the 'out_no_io_page' lable has called blk_cleanup_disk() when return failed. So remove the 'out_cleanup_disk' lable to avoid double free the disk pointer. Fixes: e92ab4eda516 ("drbd: add error handling support for add_disk()") Signed-off-by: Wu Bo <wubo40@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/1636013229-26309-1-git-send-email-wubo40@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-04net: fix possible NULL deref in sock_reserve_memoryEric Dumazet
Sanity check in sock_reserve_memory() was not enough to prevent malicious user to trigger a NULL deref. In this case, the isse is that sk_prot->memory_allocated is NULL. Use standard sk_has_account() helper to deal with this. BUG: KASAN: null-ptr-deref in instrument_atomic_read_write include/linux/instrumented.h:101 [inline] BUG: KASAN: null-ptr-deref in atomic_long_add_return include/linux/atomic/atomic-instrumented.h:1218 [inline] BUG: KASAN: null-ptr-deref in sk_memory_allocated_add include/net/sock.h:1371 [inline] BUG: KASAN: null-ptr-deref in sock_reserve_memory net/core/sock.c:994 [inline] BUG: KASAN: null-ptr-deref in sock_setsockopt+0x22ab/0x2b30 net/core/sock.c:1443 Write of size 8 at addr 0000000000000000 by task syz-executor.0/11270 CPU: 1 PID: 11270 Comm: syz-executor.0 Not tainted 5.15.0-syzkaller #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 __kasan_report mm/kasan/report.c:446 [inline] kasan_report.cold+0x66/0xdf mm/kasan/report.c:459 check_region_inline mm/kasan/generic.c:183 [inline] kasan_check_range+0x13d/0x180 mm/kasan/generic.c:189 instrument_atomic_read_write include/linux/instrumented.h:101 [inline] atomic_long_add_return include/linux/atomic/atomic-instrumented.h:1218 [inline] sk_memory_allocated_add include/net/sock.h:1371 [inline] sock_reserve_memory net/core/sock.c:994 [inline] sock_setsockopt+0x22ab/0x2b30 net/core/sock.c:1443 __sys_setsockopt+0x4f8/0x610 net/socket.c:2172 __do_sys_setsockopt net/socket.c:2187 [inline] __se_sys_setsockopt net/socket.c:2184 [inline] __x64_sys_setsockopt+0xba/0x150 net/socket.c:2184 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f56076d5ae9 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f5604c4b188 EFLAGS: 00000246 ORIG_RAX: 0000000000000036 RAX: ffffffffffffffda RBX: 00007f56077e8f60 RCX: 00007f56076d5ae9 RDX: 0000000000000049 RSI: 0000000000000001 RDI: 0000000000000003 RBP: 00007f560772ff25 R08: 000000000000fec7 R09: 0000000000000000 R10: 0000000020000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fffb61a100f R14: 00007f5604c4b300 R15: 0000000000022000 </TASK> Fixes: 2bb2f5fb21b0 ("net: add new socket option SO_RESERVE_MEM") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-04tcp: Use BIT() for OPTION_* constantsLeonard Crestez
Extending these flags using the existing (1 << x) pattern triggers complaints from checkpatch. Instead of ignoring checkpatch modify the existing values to use BIT(x) style in a separate commit. Signed-off-by: Leonard Crestez <cdleonard@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-04selftests: net: properly support IPv6 in GSO GRE testAndrea Righi
Explicitly pass -6 to netcat when the test is using IPv6 to prevent failures. Also make sure to pass "-N" to netcat to close the socket after EOF on the client side, otherwise we would always hit the timeout and the test would fail. Without this fix applied: TEST: GREv6/v4 - copy file w/ TSO [FAIL] TEST: GREv6/v4 - copy file w/ GSO [FAIL] TEST: GREv6/v6 - copy file w/ TSO [FAIL] TEST: GREv6/v6 - copy file w/ GSO [FAIL] With this fix applied: TEST: GREv6/v4 - copy file w/ TSO [ OK ] TEST: GREv6/v4 - copy file w/ GSO [ OK ] TEST: GREv6/v6 - copy file w/ TSO [ OK ] TEST: GREv6/v6 - copy file w/ GSO [ OK ] Fixes: 025efa0a82df ("selftests: add simple GSO GRE test") Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-04parisc: move CPU field back into thread_infoArd Biesheuvel
In commit 2214c0e77259 ("parisc: Move thread_info into task struct") PA-RISC gained support for THREAD_INFO_IN_TASK while changes were already underway to keep the CPU field in thread_info rather than move it into task_struct when THREAD_INFO_IN_TASK is enabled. The result is a broken build for all PA-RISC configs that enable SMP. So let's partially revert that commit, and get rid of the ugly hack to get at the offset of task_struct::cpu without having to include linux/sched.h, and put the CPU field back where it was before. Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: bcf9033e5449 ("sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-04parisc: Don't disable interrupts in cmpxchg and futex operationsDave Anglin
I no longer think interrupts can be disabled in the futex and cmpxchg operations because of COW breaks. This not ideal but I suspect it's the best we can do. For the cmpxchg operations in syscall.S, we rely on the code to not schedule off the gateway page. For the futex, I added code to disable preemption. So far, I haven't seen the warnings with the attached change but the change is only lightly tested. Signed-off-by: Dave Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-04parisc: don't enable irqs unconditionally in handle_interruption()Sven Schnelle
If the previous context had interrupts disabled, we should better keep them disabled. This was noticed in the unwinding code where a copy_from_kernel_nofault() triggered a page fault, and after the fixup by the page fault handler interrupts where suddenly enabled. Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-04ovl: fix warning in ovl_create_real()Miklos Szeredi
Syzbot triggered the following warning in ovl_workdir_create() -> ovl_create_real(): if (!err && WARN_ON(!newdentry->d_inode)) { The reason is that the cgroup2 filesystem returns from mkdir without instantiating the new dentry. Weird filesystems such as this will be rejected by overlayfs at a later stage during setup, but to prevent such a warning, call ovl_mkdir_real() directly from ovl_workdir_create() and reject this case early. Reported-and-tested-by: syzbot+75eab84fd0af9e8bf66b@syzkaller.appspotmail.com Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-11-03Merge tag 'clk-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk updates from Stephen Boyd: "The usual collection of clk driver updates and new driver additions. In terms of lines it's mainly Qualcomm and Mediatek code, supporting various SoCs and their multitude of clk controllers. New Drivers: - GCC and RPMcc support for Qualcomm QCM2290 SoCs - GCC support for Qualcomm MSM8994/MSM8992 SoCs - LPASSCC and CAMCC support for Qualcomm SC7280 SoCs - Support for Mediatek MT8195 SoCs - Initial clock driver for the Exynos850 SoC - Add i.MX8ULP clock driver and related bindings Updates: - Clock power management for new SAMA7G5 SoC - Updates to the master clock driver and sam9x60-pll to be able to use cpufreq-dt driver and avoid overclocking of CPU and MCK0 domains while changing the frequency via DVFS - Use ARRAY_SIZE in qcom clk drivers - Remove some impractical fallback parent names in qcom clk drivers - Make Mediatek clk drivers tristate - Refactoring of the CPU clock code and conversion of Samsung Exynos5433 CPU clock driver to the platform driver - A few conversions to devm_platform_ioremap_resource() - Updates of the Samsung Kconfig help text - Update video path realted clocks for Amlogic meson8 - Add SPI Multi I/O Bus and SDHI clocks and resets on Renesas RZ/G2L - Add SPI Multi I/O Bus (RPC) clocks on Renesas R-Car V3U - Add MediaLB clocks on Renesas R-Car H3, M3-W/W+, and M3-N - Remove unused helpers from i.MX specific clock header - Rework all i.MX clk based helpers to use clk_hw based ones - Rework i.MX gate/mux/divider wrappers - Rework imx_clk_hw_composite and imx_clk_hw_pll14xx wrappers - Update i.MX pllv4 and composite clocks to support i.MX8ULP - Disable i.MX7ULP composite clock during initialization - Add CLK_SET_RATE_NO_REPARENT flag to the i.MX7ULP composite - Disable the i.MX pfd when set pfdv2 clock rate - Add support for i.MX8ULP in pfdv2 - Add the pcc reset controller support on i.MX8ULP - Fix the build break when clk-imx8ulp is built as module - Move csi_sel mux to correct base register in i.MX6UL clock drivr - Fix csi clk gate register in i.MX6UL clock driver - Fix build bug making CLK_IMX8ULP select MXC_CLK - Add TPU (PWM), and Z (Cortex-A76) clocks on Renesas R-Car V3U - Add Ethernet clocks on Renesas RZ/G2L - Move Rockchip to use module_platform_probe - Enable usage of Coresight related clocks on Rockchip rk3399" * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (170 commits) clk: use clk_core_get_rate_recalc() in clk_rate_get() clk: at91: sama7g5: set low limit for mck0 at 32KHz clk: at91: sama7g5: remove prescaler part of master clock clk: at91: clk-master: add notifier for divider clk: at91: clk-sam9x60-pll: add notifier for div part of PLL clk: at91: clk-master: fix prescaler logic clk: at91: clk-master: mask mckr against layout->mask clk: at91: clk-master: check if div or pres is zero clk: at91: sam9x60-pll: use DIV_ROUND_CLOSEST_ULL clk: at91: pmc: add sama7g5 to the list of available pmcs clk: at91: clk-master: improve readability by using local variables clk: at91: clk-master: add register definition for sama7g5's master clock clk: at91: sama7g5: add securam's peripheral clock clk: at91: pmc: execute suspend/resume only for backup mode clk: at91: re-factor clocks suspend/resume clk: ux500: Add driver for the reset portions of PRCC dt-bindings: clock: u8500: Rewrite in YAML and extend clk: composite: Use rate_ops.determine_rate when also a mux is available clk: samsung: describe drivers in Kconfig clk: samsung: exynos5433: update apollo and atlas clock probing ...
2021-11-03Merge tag 'defconfig-5.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM defconfig updates from Arnd Bergmann: "These are the usual changes to enable newly added driver by default, and to do some housekeeping around changing Kconfig symbols" * tag 'defconfig-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (33 commits) arm64: defconfig: Enable Qualcomm LMH driver arm64: defconfig: Enable Qualcomm prima/pronto drivers arm64: defconfig: Enable Sleep stats driver arm64: defconfig: Visconti: Enable PCIe host controller ARM: configs: aspeed: Remove unused USB gadget devices ARM: config: aspeed: Enable Network Block Device ARM: configs: aspeed: Enable pstore and lockup detectors ARM: configs: aspeed: Enable commonly used drivers ARM: configs: aspeed: Disable IPV6 SIT device ARM: imx_v6_v7_defconfig: Enable HID I2C arm64: defconfig: Enable QTI SC7280 pinctrl, gcc and interconnect arm64: defconfig: Disable firmware sysfs fallback ARM: mvebu_v7_defconfig: rebuild default configuration ARM: mvebu_v7_defconfig: enable mtd physmap arm64: defconfig: Enable few Tegra210 based AHUB drivers arm64: defconfig: drop obsolete ARCH_* configs ARM: imx_v6_v7_defconfig: enable bpf syscall and cgroup bpf ARM: imx_v6_v7_defconfig: build imx sdma driver as module ARM: imx_v6_v7_defconfig: rebuild default configuration ARM: imx_v6_v7_defconfig: change snd soc tlv320aic3x to i2c variant ...
2021-11-03Merge tag 'drivers-5.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC driver updates from Arnd Bergmann: "These are all the driver updates for SoC specific drivers. There are a couple of subsystems with individual maintainers picking up their patches here: - The reset controller subsystem add support for a few new SoC variants to existing drivers, along with other minor improvements - The OP-TEE subsystem gets a driver for the ARM FF-A transport - The memory controller subsystem has improvements for Tegra, Mediatek, Renesas, Freescale and Broadcom specific drivers. - The tegra cpuidle driver changes get merged through this tree this time. There are only minor changes, but they depend on other tegra driver updates here. - The ep93xx platform finally moves to using the drivers/clk/ subsystem, moving the code out of arch/arm in the process. This depends on a small sound driver change that is included here as well. - There are some minor updates for Qualcomm and Tegra specific firmware drivers. The other driver updates are mainly for drivers/soc, which contains a mixture of vendor specific drivers that don't really fit elsewhere: - Mediatek drivers gain more support for MT8192, with new support for hw-mutex and mmsys routing, plus support for reset lines in the mmsys driver. - Qualcomm gains a new "sleep stats" driver, and support for the "Generic Packet Router" in the APR driver. - There is a new user interface for routing the UARTS on ASpeed BMCs, something that apparently nobody else has needed so far. - More drivers can now be built as loadable modules, in particular for Broadcom and Samsung platforms. - Lots of improvements to the TI sysc driver for better suspend/resume support" Finally, there are lots of minor cleanups and new device IDs for amlogic, renesas, tegra, qualcomm, mediateka, samsung, imx, layerscape, allwinner, broadcom, and omap" * tag 'drivers-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (179 commits) optee: Fix spelling mistake "reclain" -> "reclaim" Revert "firmware: qcom: scm: Add support for MC boot address API" qcom: spm: allow compile-testing firmware: arm_ffa: Remove unused 'compat_version' variable soc: samsung: exynos-chipid: add exynosautov9 SoC support firmware: qcom: scm: Don't break compile test on non-ARM platforms soc: qcom: smp2p: Add of_node_put() before goto soc: qcom: apr: Add of_node_put() before return soc: qcom: qcom_stats: Fix client votes offset soc: qcom: rpmhpd: fix sm8350_mxc's peer domain dt-bindings: arm: cpus: Document qcom,msm8916-smp enable-method ARM: qcom: Add qcom,msm8916-smp enable-method identical to MSM8226 firmware: qcom: scm: Add support for MC boot address API soc: qcom: spm: Add 8916 SPM register data dt-bindings: soc: qcom: spm: Document qcom,msm8916-saw2-v3.0-cpu soc: qcom: socinfo: Add PM8150C and SMB2351 models firmware: qcom_scm: Fix error retval in __qcom_scm_is_call_available() soc: aspeed: Add UART routing support soc: fsl: dpio: rename the enqueue descriptor variable soc: fsl: dpio: use an explicit NULL instead of 0 ...
2021-11-03Merge tag 'dt-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds
Pull ARM SoC DT updates from Arnd Bergmann: "This is a rather large update for the ARM devicetree files, after a few quieter releases, with 775 total commits and 47 branches pulled into this one. There are 5 new SoC types plus some minor variations, and a total of 60 new machines, so I'm limiting the summary to the main noteworthy items: - Apple M1 gain support for PCI and pinctrl, getting a bit closer to a usable system out of the box. - Qualcomm gains support for Snapdragon 690 (aka SM6350) as well as SM7225, 11 new smartphones, and three additional Chromebooks, and improvements all over the place. - Samsung gains support for ExynosAutov9, an automotive version of their smartphone SoC, but otherwise no major changes. - Microchip adds the SAMA5D29 SoC in the SAMA5 family, and a number of improvements for the recently added SAMA7 family. The LAN966 SoC that was added in the platform code does not have dts files yet. Two board files are added for the older at91sam9g20 SoC - Aspeed supports two additional server boards using their AST2600 as BMC, and improves support for qemu models - Rockchip RK3566/RK3688 gets added, along with six new development boards using RK3328/RK3399/RK3566, and one Chromebook tablet. - Two NAS boxes are added using the ARMv4 based Gemini platform - One new board is added to the Intel Arria SoC FPGA family - Marvell adds one network switch based on Armada 381 and the new MOCHAbin 7040 development board - NXP adds support for the S32G2 automotive SoC, two imx6 based ebook readers, and three additional development boards, which is notably less than their usual additions, but they also gain improvements to their many existing boards - STmicroelectronics adds their stm32mp13 SoC family along with a reference board - Renesas adds new versions of their R-Car Gen3 SoCs and many updates for their older generations - Broadcom adds support for a number of Cisco Meraki wireless controllers, along with two new boards and other updates for BCM53xx/BCM47xx networking SoCs and the Raspberry Pi boards - Mediatek improves support for the MT81xx SoCs used in Chromebooks as well as the MT76xx networking SoCs - NVIDIA adds a number of cleanups and additional support for more hardware on the already supported machines - TI K3 adds support for three new boards along with cleanups - Toshiba adds one board for the Visconti family - Xilinx adds five new ZynqMP based machines - Amlogic support is added for the Radxa Zero and two Jethub home automation controllers, along with changes to other machines - Rob Herring continues his work on fixing dtc warnings all over the tree. - Minor updates for TI OMAP, Mstar, Allwinner/sunxi, Hisilicon, Ux500, Unisoc" * tag 'dt-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (720 commits) arm64: dts: apple: j274: Expose PCI node for the Ethernet MAC address arm64: dts: apple: t8103: Add root port interrupt routing arm64: dts: apple: t8103: Add PCIe DARTs arm64: apple: Add PCIe node arm64: apple: Add pinctrl nodes ARM: dts: arm: Update ICST clock nodes 'reg' and node names ARM: dts: arm: Update register-bit-led nodes 'reg' and node names arm64: dts: exynos: add chipid node for exynosautov9 SoC ARM: dts: qcom: fix typo in IPQ8064 thermal-sensor node Revert "arm64: dts: qcom: msm8916-asus-z00l: Add sensors" arm64: dts: qcom: ipq6018: Remove unused 'iface_clk' property from dma-controller node arm64: dts: qcom: ipq6018: Remove unused 'qcom,config-pipe-trust-reg' property arm64: dts: qcom: sm8350: Add CPU topology and idle-states arm64: dts: qcom: Drop unneeded extra device-specific includes arm64: dts: qcom: msm8916: Drop standalone smem node arm64: dts: qcom: Fix node name of rpm-msg-ram device nodes arm64: dts: qcom: msm8916-asus-z00l: Add sensors arm64: dts: qcom: msm8916-asus-z00l: Add SDCard arm64: dts: qcom: msm8916-asus-z00l: Add touchscreen arm64: dts: qcom: sdm845-oneplus: remove devinfo-size from ramoops node ...
2021-11-03Merge tag 'soc-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/socLinus Torvalds
Pull ARM SoC updates from Arnd Bergmann: "The SoC updates this time are mainly removing obsolete code from the OMAP2 platform, another step in the eternal cleanup of that platform. There are two new SoCs getting added: STMicroelectronics stm32mp13 and Microchip lan966. Both fit into existing platforms and require minimal changes here. A couple of MAINTAINER file updates relate to those changes, and update some file paths" * tag 'soc-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (28 commits) MAINTAINERS: Update BCM7XXX entry with additional patterns MAINTAINERS: add pinctrl-apple-gpio to ARM/APPLE MACHINE MAINTAINERS: Add pasemi i2c to ARM/APPLE MACHINE ARM: SPEAr: Update MAINTAINERS entries ARM: OMAP2+: Drop unused CM defines for am3 ARM: OMAP2+: Drop unused CM and SCRM defines for omap4 ARM: OMAP2+: Drop unused CM and SCRM defines for omap5 ARM: OMAP2+: Drop unused CM defines for dra7 ARM: OMAP2+: Drop unused PRM defines for am3 ARM: OMAP2+: Drop unused PRM defines for am4 ARM: OMAP2+: Drop unused PRM defines for omap4 ARM: OMAP2+: Drop unused PRM defines for omap5 ARM: OMAP2+: Drop unused PRM defines for dra7 ARM: OMAP2+: Fix comment typo ARM: OMAP2+: Fix typo in some comments ARM: at91: add basic support for new SoC family lan966 dt-bindings: arm: at91: Document lan966 pcb8291 and pcb8290 boards ARM: at91: Documentation: add lan966 family ARM: at91: Documentation: add sama7g5 family MAINTAINERS: add an entry for NXP S32G boards ...
2021-11-03Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds
Pull virtio updates from Michael Tsirkin: "vhost and virtio fixes and features: - Hardening work by Jason - vdpa driver for Alibaba ENI - Performance tweaks for virtio blk - virtio rng rework using an internal buffer - mac/mtu programming for mlx5 vdpa - Misc fixes, cleanups" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (45 commits) vdpa/mlx5: Forward only packets with allowed MAC address vdpa/mlx5: Support configuration of MAC vdpa/mlx5: Fix clearing of VIRTIO_NET_F_MAC feature bit vdpa_sim_net: Enable user to set mac address and mtu vdpa: Enable user to set mac and mtu of vdpa device vdpa: Use kernel coding style for structure comments vdpa: Introduce query of device config layout vdpa: Introduce and use vdpa device get, set config helpers virtio-scsi: don't let virtio core to validate used buffer length virtio-blk: don't let virtio core to validate used length virtio-net: don't let virtio core to validate used length virtio_ring: validate used buffer length virtio_blk: correct types for status handling virtio_blk: allow 0 as num_request_queues i2c: virtio: Add support for zero-length requests virtio-blk: fixup coccinelle warnings virtio_ring: fix typos in vring_desc_extra virtio-pci: harden INTX interrupts virtio_pci: harden MSI-X interrupts virtio_config: introduce a new .enable_cbs method ...
2021-11-03PCI: cadence: Add cdns_plat_pcie_probe() missing returnLi Chen
When cdns_plat_pcie_probe() succeeds, return success instead of falling into the error handling code. Fixes: bd22885aa188 ("PCI: cadence: Refactor driver to use as a core library") Link: https://lore.kernel.org/r/DM6PR19MB40271B93057D949310F0B0EDA0BF9@DM6PR19MB4027.namprd19.prod.outlook.com Signed-off-by: Xuliang Zhang <xlzhanga@ambarella.com> Signed-off-by: Li Chen <lchen@ambarella.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org
2021-11-03Merge tag 'vfio-v5.16-rc1' of git://github.com/awilliam/linux-vfioLinus Torvalds
Pull VFIO updates from Alex Williamson: - Cleanup vfio iommu_group creation (Christoph Hellwig) - Add individual device reset for vfio/fsl-mc (Diana Craciun) - IGD OpRegion 2.0+ support (Colin Xu) - Use modern cdev lifecycle for vfio_group (Jason Gunthorpe) - Use new mdev API in vfio_ccw (Jason Gunthorpe) * tag 'vfio-v5.16-rc1' of git://github.com/awilliam/linux-vfio: (27 commits) vfio/ccw: Convert to use vfio_register_emulated_iommu_dev() vfio/ccw: Pass vfio_ccw_private not mdev_device to various functions vfio/ccw: Use functions for alloc/free of the vfio_ccw_private vfio/ccw: Remove unneeded GFP_DMA vfio: Use cdev_device_add() instead of device_create() vfio: Use a refcount_t instead of a kref in the vfio_group vfio: Don't leak a group reference if the group already exists vfio: Do not open code the group list search in vfio_create_group() vfio: Delete vfio_get/put_group from vfio_iommu_group_notifier() vfio/pci: Add OpRegion 2.0+ Extended VBT support. vfio/iommu_type1: remove IS_IOMMU_CAP_DOMAIN_IN_CONTAINER vfio/iommu_type1: remove the "external" domain vfio/iommu_type1: initialize pgsize_bitmap in ->open vfio/spapr_tce: reject mediated devices vfio: clean up the check for mediated device in vfio_iommu_type1 vfio: remove the unused mdev iommu hook vfio: move the vfio_iommu_driver_ops interface out of <linux/vfio.h> vfio: remove unused method from vfio_iommu_driver_ops vfio: simplify iommu group allocation for mediated devices vfio: remove the iommudata hack for noiommu groups ...
2021-11-03nvdimm/btt: do not call del_gendisk() if not neededLuis Chamberlain
del_gendisk() should not called if the disk has not been added. Fix this. Fixes: 41cd8b70c37a ("libnvdimm, btt: add support for blk integrity") Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20211103165843.1402142-1-mcgrof@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-03Merge branch 'per_signal_struct_coredumps-for-v5.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull per signal_struct coredumps from Eric Biederman: "Current coredumps are mixed up with the exit code, the signal handling code, and the ptrace code making coredumps much more complicated than necessary and difficult to follow. This series of changes starts with ptrace_stop and cleans it up, making it easier to follow what is happening in ptrace_stop. Then cleans up the exec interactions with coredumps. Then cleans up the coredump interactions with exit. Finally the coredump interactions with the signal handling code is cleaned up. The first and last changes are bug fixes for minor bugs. I believe the fact that vfork followed by execve can kill the process the called vfork if exec fails is sufficient justification to change the userspace visible behavior. In previous discussions some of these changes were organized differently and individually appeared to make the code base worse. As currently written I believe they all stand on their own as cleanups and bug fixes. Which means that even if the worst should happen and the last change needs to be reverted for some unimaginable reason, the code base will still be improved. If the worst does not happen there are a more cleanups that can be made. Signals that generate coredumps can easily become eligible for short circuit delivery in complete_signal. The entire rendezvous for generating a coredump can move into get_signal. The function force_sig_info_to_task be written in a way that does not modify the signal handling state of the target task (because coredumps are eligible for short circuit delivery). Many of these future cleanups can be done another way but nothing so cleanly as if coredumps become per signal_struct" * 'per_signal_struct_coredumps-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: coredump: Limit coredumps to a single thread group coredump: Don't perform any cleanups before dumping core exit: Factor coredump_exit_mm out of exit_mm exec: Check for a pending fatal signal instead of core_state ptrace: Remove the unnecessary arguments from arch_ptrace_stop signal: Remove the bogus sigkill_pending in ptrace_stop
2021-11-03signal: Add SA_IMMUTABLE to ensure forced siganls do not get changedEric W. Biederman
As Andy pointed out that there are races between force_sig_info_to_task and sigaction[1] when force_sig_info_task. As Kees discovered[2] ptrace is also able to change these signals. In the case of seeccomp killing a process with a signal it is a security violation to allow the signal to be caught or manipulated. Solve this problem by introducing a new flag SA_IMMUTABLE that prevents sigaction and ptrace from modifying these forced signals. This flag is carefully made kernel internal so that no new ABI is introduced. Longer term I think this can be solved by guaranteeing short circuit delivery of signals in this case. Unfortunately reliable and guaranteed short circuit delivery of these signals is still a ways off from being implemented, tested, and merged. So I have implemented a much simpler alternative for now. [1] https://lkml.kernel.org/r/b5d52d25-7bde-4030-a7b1-7c6f8ab90660@www.fastmail.com [2] https://lkml.kernel.org/r/202110281136.5CE65399A7@keescook Cc: stable@vger.kernel.org Fixes: 307d522f5eb8 ("signal/seccomp: Refactor seccomp signal and coredump generation") Tested-by: Andrea Righi <andrea.righi@canonical.com> Tested-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-11-03PCI: j721e: Fix j721e_pcie_probe() error pathChristophe JAILLET
If an error occurs after a successful cdns_pcie_init_phy() call, it must be undone by a cdns_pcie_disable_phy() call, as already done above and below. Update the goto to branch at the correct place of the error handling path. Link: https://lore.kernel.org/r/db477b0cb444891a17c4bb424467667dc30d0bab.1624794264.git.christophe.jaillet@wanadoo.fr Fixes: 49e0efdce791 ("PCI: j721e: Add support to provide refclk to PCIe connector") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
2021-11-03string: uninline memcpy_and_padGuenter Roeck
When building m68k:allmodconfig, recent versions of gcc generate the following error if the length of UTS_RELEASE is less than 8 bytes. In function 'memcpy_and_pad', inlined from 'nvmet_execute_disc_identify' at drivers/nvme/target/discovery.c:268:2: arch/m68k/include/asm/string.h:72:25: error: '__builtin_memcpy' reading 8 bytes from a region of size 7 Discussions around the problem suggest that this only happens if an architecture does not provide strlen(), if -ffreestanding is provided as compiler option, and if CONFIG_FORTIFY_SOURCE=n. All of this is the case for m68k. The exact reasons are unknown, but seem to be related to the ability of the compiler to evaluate the return value of strlen() and the resulting execution flow in memcpy_and_pad(). It would be possible to work around the problem by using sizeof(UTS_RELEASE) instead of strlen(UTS_RELEASE), but that would only postpone the problem until the function is called in a similar way. Uninline memcpy_and_pad() instead to solve the problem for good. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Andy Shevchenko <andriy.shevchenko@intel.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-03ACPI: video: use platform backlight driver on Xiaomi Mi Pad 2Hans de Goede
The Xiaomi Mi Pad 2 is a Cherry Trail based x86 tablet which does not use the i915's driver backlight control support: i915 0000:00:02.0: [drm:intel_panel_setup_backlight [i915]] no backlight present per VBT Like all Cherry Trail devices the ACPI tables on the Xiaomi Mi Pad 2 contain a broken ACPI-video implementation which causes 6 different acpi_video backlights to get registered when used. The lack of the i915 driver registering a BACKLIGHT_RAW (aka native) type backlight causes acpi_video_get_backlight_type() to pick the broken acpi_video backlight code as the backlight driver to use. There actually is a separate lp8556 backlight controller connected over I2C which gets registered as a BACKLIGHT_PLATFORM (aka vendor). Add a quirk to force acpi_video_get_backlight_type() to return acpi_backlight_vendor, so that the broken acpi_video backlight interfaces do not get registered. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-03ACPI: video: Drop dmi_system_id.ident settings from video_detect_dmi_table[]Hans de Goede
The .ident field of the dmi_system_id structs in the video_detect_dmi_table[] is not used by the code. Change all .ident = "..." assignments to comments, this reduces the size of video_detect.o / video.ko by about 1500 bytes. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-03ACPI: PMIC: Fix intel_pmic_regs_handler() read accessesHans de Goede
The handling of PMIC register reads through writing 0 to address 4 of the OpRegion is wrong. Instead of returning the read value through the value64, which is a no-op for function == ACPI_WRITE calls, store the value and then on a subsequent function == ACPI_READ with address == 3 (the address for the value field of the OpRegion) return the stored value. This has been tested on a Xiaomi Mi Pad 2 and makes the ACPI battery dev there mostly functional (unfortunately there are still other issues). Here are the SET() / GET() functions of the PMIC ACPI device, which use this OpRegion, which clearly show the new behavior to be correct: OperationRegion (REGS, 0x8F, Zero, 0x50) Field (REGS, ByteAcc, NoLock, Preserve) { CLNT, 8, SA, 8, OFF, 8, VAL, 8, RWM, 8 } Method (GET, 3, Serialized) { If ((AVBE == One)) { CLNT = Arg0 SA = Arg1 OFF = Arg2 RWM = Zero If ((AVBG == One)) { GPRW = Zero } } Return (VAL) /* \_SB_.PCI0.I2C7.PMI5.VAL_ */ } Method (SET, 4, Serialized) { If ((AVBE == One)) { CLNT = Arg0 SA = Arg1 OFF = Arg2 VAL = Arg3 RWM = One If ((AVBG == One)) { GPRW = One } } } Fixes: 0afa877a5650 ("ACPI / PMIC: intel: add REGS operation region support") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-03ACPI: EC: Remove initialization of static variables to falsewangzhitong
Remove the initialization of two static variables to false which is pointless. Signed-off-by: wangzhitong <wangzhitong@uniontech.com> [ rjw: Subject and changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-03ACPI: EC: Use ec_no_wakeup on HP ZHAN 66 ProBinbin Zhou
EC interrupts constantly wake up system from s2idle, so set ec_no_wakeup by default to keep the system in s2idle and reduce energy consumption. Signed-off-by: Binbin Zhou <zhoubinbin@uniontech.com> [ rjw: Changelog and subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-03at24: Support probing while in non-zero ACPI D stateSakari Ailus
In certain use cases (where the chip is part of a camera module, and the camera module is wired together with a camera privacy LED), powering on the device during probe is undesirable. Add support for the at24 to execute probe while being in ACPI D state other than 0 (which means fully powered on). Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Acked-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-03media: i2c: imx319: Support device probe in non-zero ACPI D stateRajmohan Mani
Tell ACPI device PM code that the driver supports the device being in non-zero ACPI D state when the driver's probe function is entered. Signed-off-by: Rajmohan Mani <rajmohan.mani@intel.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Reviewed-by: Bingbu Cao <bingbu.cao@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-03ACPI: Add a convenience function to tell a device is in D0 stateSakari Ailus
Add a convenience function to tell whether a device is in D0 state, primarily for use in drivers' probe or remove functions on busses where the custom is to power on the device for the duration of both. Returns false on non-ACPI systems. Suggested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-03Documentation: ACPI: Document _DSC object usage for enum power stateSakari Ailus
Document the use of the _DSC object for setting desirable power state during probe. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-03i2c: Allow an ACPI driver to manage the device's power state during probeSakari Ailus
Enable drivers to tell ACPI that there's no need to power on a device for probe. Drivers should still perform this by themselves if there's a need to. In some cases powering on the device during probe is undesirable, and this change enables a driver to choose what fits best for it. Add a field called "flags" into struct i2c_driver for driver flags, and a flag I2C_DRV_ACPI_WAIVE_D0_PROBE to tell a driver supports probe in ACPI D states other than 0. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Tomasz Figa <tfiga@chromium.org> Acked-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-03ACPI: scan: Obtain device's desired enumeration power stateSakari Ailus
Store a device's desired enumeration power state in struct acpi_device_power during acpi_device object's initialisation. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-03kdb: Adopt scheduler's task classificationDaniel Thompson
Currently kdb contains some open-coded routines to generate a summary character for each task. This code currently issues warnings, is almost certainly broken and won't make sense to any kernel dev who has ever used /proc to examine task states. Fix both the warning and the potential for confusion by adopting the scheduler's task classification. Whilst doing this we also simplify the filtering by using mask strings directly (which means we don't have to guess all the characters the scheduler might give us). Unfortunately we can't quite match the scheduler classification completely. We add four extra states: - for idle loops and i, m and s for sleeping system daemons (which means kthreads in one of the I, M and S states). These extra states are used to manage the filters for tools to make the output of ps and bta less noisy. Note: The Fixes below is the last point the original dubious code was moved; it was not introduced by that patch. However it gives us the last point to which this patch can be easily backported. Happily that should be enough to cover the introduction of CONFIG_WERROR! Fixes: 2f064a59a11f ("sched: Change task_struct::state") Link: https://lore.kernel.org/r/20211102173158.3315227-1-daniel.thompson@linaro.org Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2021-11-03MIPS: Cobalt: Explain GT64111 early PCI fixupPali Rohár
Properly document why changing PCI Class Code for GT64111 device to Host Bridge is required as important details were after 20 years forgotten. Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2021-11-03Merge tag 'jfs-5.16' of git://github.com/kleikamp/linux-shaggyLinus Torvalds
Pull jfs fix from David Kleikamp: "Just one JFS patch" * tag 'jfs-5.16' of git://github.com/kleikamp/linux-shaggy: JFS: fix memleak in jfs_mount
2021-11-03Merge tag 'trace-v5.16-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull more tracing updates from Steven Rostedt: - osnoise and timerlat updates that will work with the RTLA tool (Real-Time Linux Analysis). Specifically it disconnects the work load (threads that look for latency) from the tracing instances attached to them, allowing for more than one instance to retrieve data from the work load. - Optimization on division in the trace histogram trigger code to use shift and multiply when possible. Also added documentation. - Fix prototype to my_direct_func in direct ftrace trampoline sample code. * tag 'trace-v5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace/samples: Add missing prototype for my_direct_func tracing/selftests: Add tests for hist trigger expression parsing tracing/histogram: Document hist trigger variables tracing/histogram: Update division by 0 documentation tracing/histogram: Optimize division by constants tracing/osnoise: Remove PREEMPT_RT ifdefs from inside functions tracing/osnoise: Remove STACKTRACE ifdefs from inside functions tracing/osnoise: Allow multiple instances of the same tracer tracing/osnoise: Remove TIMERLAT ifdefs from inside functions tracing/osnoise: Support a list of trace_array *tr tracing/osnoise: Use start/stop_per_cpu_kthreads() on osnoise_cpus_write() tracing/osnoise: Split workload start from the tracer start tracing/osnoise: Improve comments about barrier need for NMI callbacks tracing/osnoise: Do not follow tracing_cpumask
2021-11-03MAINTAINERS: Update BCM7XXX entry with additional patternsFlorian Fainelli
Broadcom STB systems use the bcm7038 pattern as well as the bcm7120 pattern for some of its drivers, add those to the existing entry. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20211028163756.4014059-1-f.fainelli@gmail.com' Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-11-03blk-mq: update hctx->nr_active in blk_mq_end_request_batch()Ming Lei
In case of shared tags and none io sched, batched completion still may be run into, and hctx->nr_active is accounted when getting driver tag, so it has to be updated in blk_mq_end_request_batch(). Otherwise, hctx->nr_active may become same with queue depth, then hctx_may_queue() always return false, then io hang is caused. Fixes the issue by updating the counter in batched way. Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Fixes: f794f3351f26 ("block: add support for blk_mq_end_request_batch()") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20211102153619.3627505-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-03blk-mq: add RQF_ELV debug entryMing Lei
Looks it is missed so add it. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20211102133502.3619184-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-03blk-mq: only try to run plug merge if request has same queue with incoming bioMing Lei
It is obvious that io merge can't be done between two different queues, so just try to run io merge in case of same queue. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20211102133502.3619184-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-03block: move RQF_ELV setting into allocatorsJens Axboe
It's not safe to do this before blk_queue_enter(), as the scheduler state could have changed in between. Hence move the RQF_ELV setting into the allocators, where we know the queue is already entered. Suggested-by: Ming Lei <ming.lei@redhat.com> Reported-by: Yi Zhang <yi.zhang@redhat.com> Reported-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-11-03ice: Fix race conditions between virtchnl handling and VF ndo opsBrett Creeley
The VF can be configured via the PF's ndo ops at the same time the PF is receiving/handling virtchnl messages. This has many issues, with one of them being the ndo op could be actively resetting a VF (i.e. resetting it to the default state and deleting/re-adding the VF's VSI) while a virtchnl message is being handled. The following error was seen because a VF ndo op was used to change a VF's trust setting while the VIRTCHNL_OP_CONFIG_VSI_QUEUES was ongoing: [35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM [35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5 [35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6 Fix this by making sure the virtchnl handling and VF ndo ops that trigger VF resets cannot run concurrently. This is done by adding a struct mutex cfg_lock to each VF structure. For VF ndo ops, the mutex will be locked around the critical operations and VFR. Since the ndo ops will trigger a VFR, the virtchnl thread will use mutex_trylock(). This is done because if any other thread (i.e. VF ndo op) has the mutex, then that means the current VF message being handled is no longer valid, so just ignore it. This issue can be seen using the following commands: for i in {0..50}; do rmmod ice modprobe ice sleep 1 echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs ip link set ens785f1 vf 0 trust on ip link set ens785f0 vf 0 trust on sleep 2 echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs sleep 1 echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs ip link set ens785f1 vf 0 trust on ip link set ens785f0 vf 0 trust on done Fixes: 7c710869d64e ("ice: Add handlers for VF netdevice operations") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-03ice: Fix not stopping Tx queues for VFsBrett Creeley
When a VF is removed and/or reset its Tx queues need to be stopped from the PF. This is done by calling the ice_dis_vf_qs() function, which calls ice_vsi_stop_lan_tx_rings(). Currently ice_dis_vf_qs() is protected by the VF state bit ICE_VF_STATE_QS_ENA. Unfortunately, this is causing the Tx queues to not be disabled in some cases and when the VF tries to re-enable/reconfigure its Tx queues over virtchnl the op is failing. This is because a VF can be reset and/or removed before the ICE_VF_STATE_QS_ENA bit is set, but the Tx queues were already configured via ice_vsi_cfg_single_txq() in the VIRTCHNL_OP_CONFIG_VSI_QUEUES op. However, the ICE_VF_STATE_QS_ENA bit is set on a successful VIRTCHNL_OP_ENABLE_QUEUES, which will always happen after the VIRTCHNL_OP_CONFIG_VSI_QUEUES op. This was causing the following error message when loading the ice driver, creating VFs, and modifying VF trust in an endless loop: [35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM [35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5 [35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6 Fix this by always calling ice_dis_vf_qs() and silencing the error message in ice_vsi_stop_tx_ring() since the calling code ignores the return anyway. Also, all other places that call ice_vsi_stop_tx_ring() catch the error, so this doesn't affect those flows since there was no change to the values the function returns. Other solutions were considered (i.e. tracking which VF queues had been "started/configured" in VIRTCHNL_OP_CONFIG_VSI_QUEUES, but it seemed more complicated than it was worth. This solution also brings in the chance for other unexpected conditions due to invalid state bit checks. So, the proposed solution seemed like the best option since there is no harm in failing to stop Tx queues that were never started. This issue can be seen using the following commands: for i in {0..50}; do rmmod ice modprobe ice sleep 1 echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs ip link set ens785f1 vf 0 trust on ip link set ens785f0 vf 0 trust on sleep 2 echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs sleep 1 echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs ip link set ens785f1 vf 0 trust on ip link set ens785f0 vf 0 trust on done Fixes: 77ca27c41705 ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>