summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-05-19KVM: VMX: Fix header file dependency of asm/vmx.hJacob Xu
Include a definition of WARN_ON_ONCE() before using it. Fixes: bb1fcc70d98f ("KVM: nVMX: Allow L1 to use 5-level page walks for nested EPT") Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Jacob Xu <jacobhxu@google.com> [reworded commit message; changed <asm/bug.h> to <linux/bug.h>] Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220225012959.1554168-1-jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-05-19KVM: Don't enable hardware after a restart/shutdown is initiatedSean Christopherson
Reject hardware enabling, i.e. VM creation, if a restart/shutdown has been initiated to avoid re-enabling hardware between kvm_reboot() and machine_{halt,power_off,restart}(). The restart case is especially problematic (for x86) as enabling VMX (or clearing GIF in KVM_RUN on SVM) blocks INIT, which results in the restart/reboot hanging as BIOS is unable to wake and rendezvous with APs. Note, this bug, and the original issue that motivated the addition of kvm_reboot(), is effectively limited to a forced reboot, e.g. `reboot -f`. In a "normal" reboot, userspace will gracefully teardown userspace before triggering the kernel reboot (modulo bugs, errors, etc), i.e. any process that might do ioctl(KVM_CREATE_VM) is long gone. Fixes: 8e1c18157d87 ("KVM: VMX: Disable VMX when system shutdown") Signed-off-by: Sean Christopherson <seanjc@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Message-Id: <20230512233127.804012-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-05-19KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdownSean Christopherson
Use syscore_ops.shutdown to disable hardware virtualization during a reboot instead of using the dedicated reboot_notifier so that KVM disables virtualization _after_ system_state has been updated. This will allow fixing a race in KVM's handling of a forced reboot where KVM can end up enabling hardware virtualization between kernel_restart_prepare() and machine_restart(). Rename KVM's hook to match the syscore op to avoid any possible confusion from wiring up a "reboot" helper to a "shutdown" hook (neither "shutdown nor "reboot" is completely accurate as the hook handles both). Opportunistically rewrite kvm_shutdown()'s comment to make it less VMX specific, and to explain why kvm_rebooting exists. Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Zenghui Yu <yuzenghui@huawei.com> Cc: kvmarm@lists.linux.dev Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Cc: Anup Patel <anup@brainfault.org> Cc: Atish Patra <atishp@atishpatra.org> Cc: kvm-riscv@lists.infradead.org Signed-off-by: Sean Christopherson <seanjc@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Message-Id: <20230512233127.804012-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-05-19Merge tag 'sound-6.4-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes that have been gathered since rc1: - Lots of small ASoC SOF Intel fixes - A couple of UAF and NULL-dereference fixes - Quirks and updates for HD-audio, USB-audio and ASoC AMD - A few minor build / sparse warning fixes - MAINTAINERS and DT updates" * tag 'sound-6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (38 commits) ALSA: hda: Add NVIDIA codec IDs a3 through a7 to patch table ALSA: oss: avoid missing-prototype warnings ALSA: cs46xx: mark snd_cs46xx_download_image as static ALSA: hda: Fix Oops by 9.1 surround channel names ASoC: SOF: topology: Fix tuples array allocation ASoC: SOF: Separate the tokens for input and output pin index MAINTAINERS: Remove self from Cirrus Codec drivers ASoC: cs35l56: Prevent unbalanced pm_runtime in dsp_work() on SoundWire ASoC: SOF: topology: Fix logic for copying tuples ASoC: SOF: pm: save io region state in case of errors in resume ASoC: MAINTAINERS: drop Krzysztof Kozlowski from Samsung audio ASoC: mediatek: mt8186: Fix use-after-free in driver remove path ASoC: SOF: ipc3-topology: Make sure that only one cmd is sent in dai_config ASoC: SOF: sof-client-probes: fix pm_runtime imbalance in error handling ASoC: SOF: pcm: fix pm_runtime imbalance in error handling ASoC: SOF: debug: conditionally bump runtime_pm counter on exceptions ASoC: SOF: Intel: hda-mlink: add helper to program SoundWire PCMSyCM registers ASoC: SOF: Intel: hda-mlink: initialize instance_offset member ASoC: SOF: Intel: hda-mlink: use 'ml_addr' parameter consistently ASoC: SOF: Intel: hda-mlink: fix base_ptr computation ...
2023-05-19perf bench syscall: Fix __NR_execve undeclared build errorTiezhu Yang
The __NR_execve definition for i386 was deleted by mistake in the commit ece7f7c0507c ("perf bench syscall: Add fork syscall benchmark"), add it to fix the build error on i386. Fixes: ece7f7c0507cc147 ("perf bench syscall: Add fork syscall benchmark") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: loongson-kernel@lists.loongnix.cn Closes: https://lore.kernel.org/all/CA+G9fYvgBR1iB0CorM8OC4AM_w_tFzyQKHc+rF6qPzJL=TbfDQ@mail.gmail.com/ Link: https://lore.kernel.org/r/1684480657-2375-1-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-19Merge branch 'pm-tools'Rafael J. Wysocki
Merge cpupower utility fixes for 6.4-rc3: - Read TSC on each CPU right before reading MPERF so as to reduce the potential time difference between the TSC and MPERF accesses and improve the C0 percentage calculation (Wyes Karny). - Fix a possible file handle leak and clean up the code in sysfs_get_enabled() (Hao Zeng). * pm-tools: cpupower: Make TSC read per CPU for Mperf monitor cpupower:Fix resource leaks in sysfs_get_enabled()
2023-05-19Merge tag 'linux-cpupower-6.4-rc3' of ↵Rafael J. Wysocki
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux Pull cpupower utility fixes for 6.4-rc3 from Shuah Khan: "This cpupower fixes update for Linux 67.4-rc3 consists of: - a resource leak fix - fix drift in C0 percentage calculation due to System-wide TSC read. To lower this drift read TSC per CPU and also just after mperf read. This technique improves C0 percentage calculation in Mperf monitor" * tag 'linux-cpupower-6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux: cpupower: Make TSC read per CPU for Mperf monitor cpupower:Fix resource leaks in sysfs_get_enabled()
2023-05-19fbdev: omapfb: panel-tpo-td043mtea1: fix error code in probe()Dan Carpenter
This was using the wrong variable, "r", instead of "ddata->vcc_reg", so it returned success instead of a negative error code. Fixes: 0d3dbeb8142a ("video: fbdev: omapfb: panel-tpo-td043mtea1: Make use of the helper function dev_err_probe()") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Helge Deller <deller@gmx.de>
2023-05-19perf test attr: Fix python SafeConfigParser() deprecation warningIan Rogers
Address the warning: ``` tests/attr.py:155: DeprecationWarning: The SafeConfigParser class has been renamed to ConfigParser in Python 3.2. This alias will be removed in Python 3.12. Use ConfigParser directly instead. parser = configparser.SafeConfigParser() ``` by removing the word 'Safe'. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20230517225707.2682235-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-19perf test attr: Update no event/metric expectationsIan Rogers
Previously hard coded events/metrics were used, update for the use of the TopdownL1 json metric group. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Fixes: 94b1a603fca78388 ("perf stat: Add TopdownL1 metric as a default if present") Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20230517225707.2682235-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-19driver core: class: properly reference count class_dev_iter()Greg Kroah-Hartman
When class_dev_iter is initialized, the reference count for the subsys private structure is incremented, but never decremented, causing a memory leak over time. To resolve this, save off a pointer to the internal structure into the class_dev_iter structure and then when the iterator is finished, drop the reference count. Reported-and-tested-by: syzbot+e7afd76ad060fa0d2605@syzkaller.appspotmail.com Fixes: 7b884b7f24b4 ("driver core: class.c: convert to only use class_to_subsys") Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Cc: Alan Stern <stern@rowland.harvard.edu> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr> Link: https://lore.kernel.org/r/2023051610-stove-condense-9a77@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-18Merge tag 'nvme-6.4-2023-05-18' of git://git.infradead.org/nvme into block-6.4Jens Axboe
Pull NVMe fixes from Keith: "nvme fixes for Linux 6.4 - More device quirks (Sagi, Hristo, Adrian, Daniel) - Controller delete race (Maurizo) - Multipath cleanup fix (Christoph)" * tag 'nvme-6.4-2023-05-18' of git://git.infradead.org/nvme: nvme-pci: Add quirk for Teamgroup MP33 SSD nvme: do not let the user delete a ctrl before a complete initialization nvme-multipath: don't call blk_mark_disk_dead in nvme_mpath_remove_disk nvme-pci: clamp max_hw_sectors based on DMA optimized limitation nvme-pci: add quirk for missing secondary temperature thresholds nvme-pci: add NVME_QUIRK_BOGUS_NID for HS-SSD-FUTURE 2048G
2023-05-19Merge tag 'amd-drm-fixes-6.4-2023-05-18' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.4-2023-05-18: amdgpu: - update gfx11 clock counter logic - Fix a race when disabling gfxoff on gfx10/11 for profiling - Raven/Raven2/PCO clock counter fix - Add missing get_vbios_fb_size for GMC 11 - Fix a spurious irq warning in the device remove case - Fix possible power mode mismatch between driver and PMFW - USB4 fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230518174811.3841-1-alexander.deucher@amd.com
2023-05-19Merge tag 'drm-msm-fixes-2023-05-17' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-fixes msm-fixes for v6.4-rc3 Display Fixes: + Catalog fixes: - fix the programmable fetch lines and qos settings of msm8998 to match what is present downstream - fix the LM pairs for msm8998 to match what is present downstream. The current settings are not right as LMs with incompatible connected blocks are paired - remove unused INTF0 interrupt mask from SM6115/QCM2290 as there is no INTF0 present on those chipsets. There is only one DSI on index 1 - remove TE2 block from relevant chipsets because this is mainly used for ping-pong split feature which is not supported upstream and also for the chipsets where we are removing them in this change, that block is not present as the tear check has been moved to the intf block - relocate non-MDP_TOP INTF_INTR offsets from dpu_hwio.h to dpu_hw_interrupts.c to match where they belong - fix the indentation for REV_7xxx interrupt masks - fix the offset and version for dither blocks of SM8[34]50/SC8280XP chipsets as it was incorrect - make the ping-pong blk length 0 for appropriate chipsets as those chipsets only have a dither ping-pong dither block but no other functionality in the base ping-pong - remove some duplicate register defines from INTF + Fix the log mask for the writeback block so that it can be enabled correctly via debugfs + unregister the hdmi codec for dp during unbind otherwise it leaks audio codec devices + Yaml change to fix warnings related to 'qcom,master-dsi' and 'qcom,sync-dual-dsi' GPU Fixes: + fix submit error path leak + arm-smmu-qcom fix for regression that broke per-process page tables + fix no-iommu crash Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvHEcJfp=k6qatmb_SvAeyvy3CBpaPfwLqtNthuEzA_7w@mail.gmail.com
2023-05-19Merge tag 'drm-intel-fixes-2023-05-17' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes Add missing null check for HDCP code. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZGUgi7kXq+MiLcCA@jlahtine-mobl.ger.corp.intel.com
2023-05-18nvme-pci: Add quirk for Teamgroup MP33 SSDDaniel Smith
Add a quirk for Teamgroup MP33 that reports duplicate ids for disk. Signed-off-by: Daniel Smith <dansmith@ds.gy> [kch: patch formatting] Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Daniel Smith <dansmith@ds.gy> Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-05-19Merge tag 'exynos-drm-fixes-for-v6.4-rc3' of ↵Dave Airlie
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes Just one fixup to exynos_drm_g2d module. - Fix below build warning by marking them as 'static inline' drivers/gpu/drm/exynos/exynos_drm_g2d.h:37:5: error: no previous prototype for 'g2d_open' drivers/gpu/drm/exynos/exynos_drm_g2d.h:42:5: error: no previous prototype for 'g2d_close' Signed-off-by: Dave Airlie <airlied@redhat.com> From: Inki Dae <inki.dae@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230515083943.43933-1-inki.dae@samsung.com
2023-05-18Merge tag 'mm-hotfixes-stable-2023-05-18-15-52' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "Eight hotfixes. Four are cc:stable, the other four are for post-6.4 issues, or aren't considered suitable for backporting" * tag 'mm-hotfixes-stable-2023-05-18-15-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: MAINTAINERS: Cleanup Arm Display IP maintainers MAINTAINERS: repair pattern in DIALOG SEMICONDUCTOR DRIVERS nilfs2: fix use-after-free bug of nilfs_root in nilfs_evict_inode() mm: fix zswap writeback race condition mm: kfence: fix false positives on big endian zsmalloc: move LRU update from zs_map_object() to zs_malloc() mm: shrinkers: fix race condition on debugfs cleanup maple_tree: make maple state reusable after mas_empty_area()
2023-05-18Merge tag 'probes-fixes-v6.4-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fixes from Masami Hiramatsu: - Initialize 'ret' local variables on fprobe_handler() to fix the smatch warning. With this, fprobe function exit handler is not working randomly. - Fix to use preempt_enable/disable_notrace for rethook handler to prevent recursive call of fprobe exit handler (which is based on rethook) - Fix recursive call issue on fprobe_kprobe_handler() - Fix to detect recursive call on fprobe_exit_handler() - Fix to make all arch-dependent rethook code notrace (the arch-independent code is already notrace)" * tag 'probes-fixes-v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rethook, fprobe: do not trace rethook related functions fprobe: add recursion detection in fprobe_exit_handler fprobe: make fprobe_kprobe_handler recursion free rethook: use preempt_{disable, enable}_notrace in rethook_trampoline_handler tracing: fprobe: Initialize ret valiable to fix smatch error
2023-05-18Merge tag 'net-6.4-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from can, xfrm, bluetooth and netfilter. Current release - regressions: - ipv6: fix RCU splat in ipv6_route_seq_show() - wifi: iwlwifi: disable RFI feature Previous releases - regressions: - tcp: fix possible sk_priority leak in tcp_v4_send_reset() - tipc: do not update mtu if msg_max is too small in mtu negotiation - netfilter: fix null deref on element insertion - devlink: change per-devlink netdev notifier to static one - phylink: fix ksettings_set() ethtool call - wifi: mac80211: fortify the spinlock against deadlock by interrupt - wifi: brcmfmac: check for probe() id argument being NULL - eth: ice: - fix undersized tx_flags variable - fix ice VF reset during iavf initialization - eth: hns3: fix sending pfc frames after reset issue Previous releases - always broken: - xfrm: release all offloaded policy memory - nsh: use correct mac_offset to unwind gso skb in nsh_gso_segment() - vsock: avoid to close connected socket after the timeout - dsa: rzn1-a5psw: enable management frames for CPU port - eth: virtio_net: fix error unwinding of XDP initialization - eth: tun: fix memory leak for detached NAPI queue. Misc: - MAINTAINERS: sctp: move Neil to CREDITS" * tag 'net-6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits) MAINTAINERS: skip CCing netdev for Bluetooth patches mdio_bus: unhide mdio_bus_init prototype bridge: always declare tunnel functions atm: hide unused procfs functions net: isa: include net/Space.h Revert "ARM: dts: stm32: add CAN support on stm32f746" netfilter: nft_set_rbtree: fix null deref on element insertion netfilter: nf_tables: fix nft_trans type confusion netfilter: conntrack: define variables exp_nat_nla_policy and any_addr with CONFIG_NF_NAT net: wwan: t7xx: Ensure init is completed before system sleep net: selftests: Fix optstring net: pcs: xpcs: fix C73 AN not getting enabled net: wwan: iosm: fix NULL pointer dereference when removing device vlan: fix a potential uninit-value in vlan_dev_hard_start_xmit() mailmap: add entries for Nikolay Aleksandrov igb: fix bit_shift to be in [1..8] range net: dsa: mv88e6xxx: Fix mv88e6393x EPC write command offset cassini: Fix a memory leak in the error handling path of cas_init_one() tun: Fix memory leak for detached NAPI queue. can: kvaser_pciefd: Disable interrupts in probe error path ...
2023-05-18Merge tag 'media/v6.4-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "Several fixes for the dvb core and drivers: - fix UAF and null pointer de-reference in DVB core - fix kernel runtime warning for blocking operation in wait_event*() in dvb core - fix write size bug in DVB conditional access core - fix dvb demux continuity counter debug check logic - randconfig build fixes in pvrusb2 and mn88443x - fix memory leak in ttusb-dec - fix netup_unidvb probe-time error check logic - improve error handling in dw2102 if it can't retrieve DVB MAC address" * tag 'media/v6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: dvb-core: Fix use-after-free due to race condition at dvb_ca_en50221 media: dvb-core: Fix kernel WARNING for blocking operation in wait_event*() media: dvb-core: Fix use-after-free due to race at dvb_register_device() media: dvb-core: Fix use-after-free due on race condition at dvb_net media: dvb-core: Fix use-after-free on race condition at dvb_frontend media: mn88443x: fix !CONFIG_OF error by drop of_match_ptr from ID table media: ttusb-dec: fix memory leak in ttusb_dec_exit_dvb() media: dvb_ca_en50221: fix a size write bug media: netup_unidvb: fix irq init by register it at the end of probe media: dvb-usb: dw2102: fix uninit-value in su3000_read_mac_address media: dvb-usb: digitv: fix null-ptr-deref in digitv_i2c_xfer() media: dvb-usb-v2: rtl28xxu: fix null-ptr-deref in rtl28xxu_i2c_xfer media: dvb-usb-v2: ce6230: fix null-ptr-deref in ce6230_i2c_master_xfer() media: dvb-usb-v2: ec168: fix null-ptr-deref in ec168_i2c_xfer() media: dvb-usb: az6027: fix three null-ptr-deref in az6027_i2c_xfer() media: netup_unidvb: fix use-after-free at del_timer() media: dvb_demux: fix a bug for the continuity counter media: pvrusb2: fix DVB_CORE dependency
2023-05-18ublk: fix AB-BA lockdep warningMing Lei
When handling UBLK_IO_FETCH_REQ, ctx->uring_lock is grabbed first, then ub->mutex is acquired. When handling UBLK_CMD_STOP_DEV or UBLK_CMD_DEL_DEV, ub->mutex is grabbed first, then calling io_uring_cmd_done() for canceling uring command, in which ctx->uring_lock may be required. Real deadlock only happens when all the above commands are issued from same uring context, and in reality different uring contexts are often used for handing control command and IO command. Fix the issue by using io_uring_cmd_complete_in_task() to cancel command in ublk_cancel_dev(ublk_cancel_queue). Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Closes: https://lore.kernel.org/linux-block/becol2g7sawl4rsjq2dztsbc7mqypfqko6wzsyoyazqydoasml@rcxarzwidrhk Cc: Ziyang Zhang <ZiyangZhang@linux.alibaba.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20230517133408.210944-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-05-18drm/amd/display: enable dpia validateMustapha Ghaddar
Use dpia_validate_usb4_bw() function Fixes: a8b537605e22 ("drm/amd/display: Add function pointer for validate bw usb4") Reviewed-by: Roman Li <roman.li@amd.com> Reviewed-by: Meenakshikumar Somasundaram <meenakshikumar.somasundaram@amd.com> Acked-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Mustapha Ghaddar <mghaddar@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-05-18drm/amd/pm: fix possible power mode mismatch between driver and PMFWEvan Quan
PMFW may boots the ASIC with a different power mode from the system's real one. Notify PMFW explicitly the power mode the system in. This is needed only when ACDC switch via gpio is not supported. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-05-18drm/amdgpu: skip disabling fence driver src_irqs when device is unpluggedGuchun Chen
When performing device unbind or halt, we have disabled all irqs at the very begining like amdgpu_pci_remove or amdgpu_device_halt. So amdgpu_irq_put for irqs stored in fence driver should not be called any more, otherwise, below calltrace will arrive. [ 139.114088] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:616 amdgpu_irq_put+0xf6/0x110 [amdgpu] [ 139.114655] Call Trace: [ 139.114655] <TASK> [ 139.114657] amdgpu_fence_driver_hw_fini+0x93/0x130 [amdgpu] [ 139.114836] amdgpu_device_fini_hw+0xb6/0x350 [amdgpu] [ 139.114955] amdgpu_driver_unload_kms+0x51/0x70 [amdgpu] [ 139.115075] amdgpu_pci_remove+0x63/0x160 [amdgpu] [ 139.115193] ? __pm_runtime_resume+0x64/0x90 [ 139.115195] pci_device_remove+0x3a/0xb0 [ 139.115197] device_remove+0x43/0x70 [ 139.115198] device_release_driver_internal+0xbd/0x140 Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-05-18drm/amdgpu/gmc11: implement get_vbios_fb_size()Alex Deucher
Implement get_vbios_fb_size() so we can properly reserve the vbios splash screen to avoid potential artifacts on the screen during the transition from the pre-OS console to the OS console. Acked-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-05-18drm/amdgpu: Differentiate between Raven2 and Raven/Picasso according to ↵Jesse Zhang
revision id Due to the raven2 and raven/picasso maybe have the same GC_HWIP version. So differentiate them by revision id. Signed-off-by: shanshengwang <shansheng.wang@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-05-18drm/amdgpu/gfx11: Adjust gfxoff before powergating on gfx11 as wellGuilherme G. Piccoli
(Bas: speculative change to mirror gfx10/gfx9) Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-05-18drm/amdgpu/gfx10: Disable gfxoff before disabling powergating.Bas Nieuwenhuizen
Otherwise we get a full system lock (looks like a FW mess). Copied the order from the GFX9 powergating code. Fixes: 366468ff6c34 ("drm/amdgpu: Allow GfxOff on Vangogh as default") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2545 Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-05-18ceph: force updating the msg pointer in non-split caseXiubo Li
When the MClientSnap reqeust's op is not CEPH_SNAP_OP_SPLIT the request may still contain a list of 'split_realms', and we need to skip it anyway. Or it will be parsed as a corrupt snaptrace. Cc: stable@vger.kernel.org Link: https://tracker.ceph.com/issues/61200 Reported-by: Frank Schilder <frans@dtu.dk> Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2023-05-18ceph: silence smatch warning in reconnect_caps_cb()Xiubo Li
Smatch static checker warning: fs/ceph/mds_client.c:3968 reconnect_caps_cb() warn: missing error code here? '__get_cap_for_mds()' failed. 'err' = '0' [ idryomov: Dan says that Smatch considers it intentional only if the "ret = 0;" assignment is within 4 or 5 lines of the goto. ] Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2023-05-18Merge tag 'linux-can-fixes-for-6.4-20230518' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2023-05-18 this is a pull request of 7 patches for net/master. The first 6 patches are by Jimmy Assarsson and fix several bugs in the kvaser_pciefd driver. The latest patch is from me and reverts a change in stm32f746.dtsi that causes build errors due to a missing dependent patch. * tag 'linux-can-fixes-for-6.4-20230518' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: Revert "ARM: dts: stm32: add CAN support on stm32f746" can: kvaser_pciefd: Disable interrupts in probe error path can: kvaser_pciefd: Do not send EFLUSH command on TFD interrupt can: kvaser_pciefd: Empty SRB buffer in probe can: kvaser_pciefd: Call request_irq() before enabling interrupts can: kvaser_pciefd: Clear listen-only bit if not explicitly requested can: kvaser_pciefd: Set CAN_STATE_STOPPED in kvaser_pciefd_stop() ==================== Link: https://lore.kernel.org/r/20230518073241.1110453-1-mkl@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-17Merge tag 'nf-23-05-17' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Florian Westphal says: ==================== Netfilter fixes for net 1. Silence warning about unused variable when CONFIG_NF_NAT=n, from Tom Rix. 2. nftables: Fix possible out-of-bounds access, from myself. 3. nftables: fix null deref+UAF during element insertion into rbtree, also from myself. * tag 'nf-23-05-17' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_set_rbtree: fix null deref on element insertion netfilter: nf_tables: fix nft_trans type confusion netfilter: conntrack: define variables exp_nat_nla_policy and any_addr with CONFIG_NF_NAT ==================== Link: https://lore.kernel.org/r/20230517123756.7353-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-17Merge tag 'wireless-2023-05-17' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Kalle Valo says: ==================== wireless fixes for v6.4 A lot of fixes this time, for both the stack and the drivers. The brcmfmac resume fix has been reported by several people so I would say it's the most important here. The iwlwifi RFI workaround is also something which was reported as a regression recently. * tag 'wireless-2023-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: (31 commits) wifi: b43: fix incorrect __packed annotation wifi: rtw88: sdio: Always use two consecutive bytes for word operations mac80211_hwsim: fix memory leak in hwsim_new_radio_nl wifi: iwlwifi: mvm: Add locking to the rate read flow wifi: iwlwifi: Don't use valid_links to iterate sta links wifi: iwlwifi: mvm: don't trust firmware n_channels wifi: iwlwifi: mvm: fix OEM's name in the tas approved list wifi: iwlwifi: fix OEM's name in the ppag approved list wifi: iwlwifi: mvm: fix initialization of a return value wifi: iwlwifi: mvm: fix access to fw_id_to_mac_id wifi: iwlwifi: fw: fix DBGI dump wifi: iwlwifi: mvm: fix number of concurrent link checks wifi: iwlwifi: mvm: fix cancel_delayed_work_sync() deadlock wifi: iwlwifi: mvm: don't double-init spinlock wifi: iwlwifi: mvm: always free dup_data wifi: mac80211: recalc chanctx mindef before assigning wifi: mac80211: consider reserved chanctx for mindef wifi: mac80211: simplify chanctx allocation wifi: mac80211: Abort running color change when stopping the AP wifi: mac80211: fix min center freq offset tracing ... ==================== Link: https://lore.kernel.org/r/20230517151914.B0AF6C433EF@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-17MAINTAINERS: skip CCing netdev for Bluetooth patchesJakub Kicinski
As requested by Marcel skip netdev for Bluetooth patches. Bluetooth has its own mailing list and overloading netdev leads to fewer people reading it. Link: https://lore.kernel.org/netdev/639C8EA4-1F6E-42BE-8F04-E4A753A6EFFC@holtmann.org/ Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230517014253.1233333-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-17mdio_bus: unhide mdio_bus_init prototypeArnd Bergmann
mdio_bus_init() is either used as a local module_init() entry, or it gets called in phy_device.c. In the former case, there is no declaration, which causes a warning: drivers/net/phy/mdio_bus.c:1371:12: error: no previous prototype for 'mdio_bus_init' [-Werror=missing-prototypes] Remove the #ifdef around the declaration to avoid the warning.. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230516194625.549249-4-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-17bridge: always declare tunnel functionsArnd Bergmann
When CONFIG_BRIDGE_VLAN_FILTERING is disabled, two functions are still defined but have no prototype or caller. This causes a W=1 warning for the missing prototypes: net/bridge/br_netlink_tunnel.c:29:6: error: no previous prototype for 'vlan_tunid_inrange' [-Werror=missing-prototypes] net/bridge/br_netlink_tunnel.c:199:5: error: no previous prototype for 'br_vlan_tunnel_info' [-Werror=missing-prototypes] The functions are already contitional on CONFIG_BRIDGE_VLAN_FILTERING, and I coulnd't easily figure out the right set of #ifdefs, so just move the declarations out of the #ifdef to avoid the warning, at a small cost in code size over a more elaborate fix. Fixes: 188c67dd1906 ("net: bridge: vlan options: add support for tunnel id dumping") Fixes: 569da0822808 ("net: bridge: vlan options: add support for tunnel mapping set/del") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20230516194625.549249-3-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-17atm: hide unused procfs functionsArnd Bergmann
When CONFIG_PROC_FS is disabled, the function declarations for some procfs functions are hidden, but the definitions are still build, as shown by this compiler warning: net/atm/resources.c:403:7: error: no previous prototype for 'atm_dev_seq_start' [-Werror=missing-prototypes] net/atm/resources.c:409:6: error: no previous prototype for 'atm_dev_seq_stop' [-Werror=missing-prototypes] net/atm/resources.c:414:7: error: no previous prototype for 'atm_dev_seq_next' [-Werror=missing-prototypes] Add another #ifdef to leave these out of the build. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230516194625.549249-2-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-17net: isa: include net/Space.hArnd Bergmann
The legacy drivers that still get called from net/Space.c have prototypes in net/Space, but this header is not included in most of the files that define those functions: drivers/net/ethernet/cirrus/cs89x0.c:1649:28: error: no previous prototype for 'cs89x0_probe' [-Werror=missing-prototypes] drivers/net/ethernet/8390/ne.c:947:28: error: no previous prototype for 'ne_probe' [-Werror=missing-prototypes] drivers/net/ethernet/8390/smc-ultra.c:167:28: error: no previous prototype for 'ultra_probe' [-Werror=missing-prototypes] drivers/net/ethernet/amd/lance.c:438:28: error: no previous prototype for 'lance_probe' [-Werror=missing-prototypes] drivers/net/ethernet/3com/3c515.c:422:20: error: no previous prototype for 'tc515_probe' [-Werror=missing-prototypes] Add the inclusion to avoids the warnings. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230516194625.549249-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-17MAINTAINERS: Cleanup Arm Display IP maintainersLiviu Dudau
Some people have moved to different roles and are no longer involved in the upstream development. As there is only one person left, remove the mailing list as well as it serves no purpose. Link: https://lkml.kernel.org/r/20230510122811.1872358-1-liviu.dudau@arm.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com> Acked-by: Brian Starkey <brian.starkey@arm.com> Cc: Mihail Atanassov <mihail.atanassov@arm.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Joe Perches <joe@perches.com> # "Please use --order" Cc: Mihail Atanassov <mihail.atanassov@arm.com> Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-17MAINTAINERS: repair pattern in DIALOG SEMICONDUCTOR DRIVERSLukas Bulwahn
Commit 361104b05684c ("dt-bindings: mfd: Convert da9063 to yaml") converts da9063.txt to dlg,da9063.yaml and adds a new file pattern in MAINTAINERS. Unfortunately, the file pattern matches da90*.yaml, but the yaml file is prefixed with dlg,da90. Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about a broken file pattern. Repair this file pattern in DIALOG SEMICONDUCTOR DRIVERS. Link: https://lkml.kernel.org/r/20230509074834.21521-1-lukas.bulwahn@gmail.com Fixes: 361104b05684c ("dt-bindings: mfd: Convert da9063 to yaml") Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Cc: Lee Jones <lee@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-17nilfs2: fix use-after-free bug of nilfs_root in nilfs_evict_inode()Ryusuke Konishi
During unmount process of nilfs2, nothing holds nilfs_root structure after nilfs2 detaches its writer in nilfs_detach_log_writer(). However, since nilfs_evict_inode() uses nilfs_root for some cleanup operations, it may cause use-after-free read if inodes are left in "garbage_list" and released by nilfs_dispose_list() at the end of nilfs_detach_log_writer(). Fix this issue by modifying nilfs_evict_inode() to only clear inode without additional metadata changes that use nilfs_root if the file system is degraded to read-only or the writer is detached. Link: https://lkml.kernel.org/r/20230509152956.8313-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+78d4495558999f55d1da@syzkaller.appspotmail.com Closes: https://lkml.kernel.org/r/00000000000099e5ac05fb1c3b85@google.com Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-17mm: fix zswap writeback race conditionDomenico Cerasuolo
The zswap writeback mechanism can cause a race condition resulting in memory corruption, where a swapped out page gets swapped in with data that was written to a different page. The race unfolds like this: 1. a page with data A and swap offset X is stored in zswap 2. page A is removed off the LRU by zpool driver for writeback in zswap-shrink work, data for A is mapped by zpool driver 3. user space program faults and invalidates page entry A, offset X is considered free 4. kswapd stores page B at offset X in zswap (zswap could also be full, if so, page B would then be IOed to X, then skip step 5.) 5. entry A is replaced by B in tree->rbroot, this doesn't affect the local reference held by zswap-shrink work 6. zswap-shrink work writes back A at X, and frees zswap entry A 7. swapin of slot X brings A in memory instead of B The fix: Once the swap page cache has been allocated (case ZSWAP_SWAPCACHE_NEW), zswap-shrink work just checks that the local zswap_entry reference is still the same as the one in the tree. If it's not the same it means that it's either been invalidated or replaced, in both cases the writeback is aborted because the local entry contains stale data. Reproducer: I originally found this by running `stress` overnight to validate my work on the zswap writeback mechanism, it manifested after hours on my test machine. The key to make it happen is having zswap writebacks, so whatever setup pumps /sys/kernel/debug/zswap/written_back_pages should do the trick. In order to reproduce this faster on a vm, I setup a system with ~100M of available memory and a 500M swap file, then running `stress --vm 1 --vm-bytes 300000000 --vm-stride 4000` makes it happen in matter of tens of minutes. One can speed things up even more by swinging /sys/module/zswap/parameters/max_pool_percent up and down between, say, 20 and 1; this makes it reproduce in tens of seconds. It's crucial to set `--vm-stride` to something other than 4096 otherwise `stress` won't realize that memory has been corrupted because all pages would have the same data. Link: https://lkml.kernel.org/r/20230503151200.19707-1-cerasuolodomenico@gmail.com Signed-off-by: Domenico Cerasuolo <cerasuolodomenico@gmail.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Chris Li (Google) <chrisl@kernel.org> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Seth Jennings <sjenning@redhat.com> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-17mm: kfence: fix false positives on big endianMichael Ellerman
Since commit 1ba3cbf3ec3b ("mm: kfence: improve the performance of __kfence_alloc() and __kfence_free()"), kfence reports failures in random places at boot on big endian machines. The problem is that the new KFENCE_CANARY_PATTERN_U64 encodes the address of each byte in its value, so it needs to be byte swapped on big endian machines. The compiler is smart enough to do the le64_to_cpu() at compile time, so there is no runtime overhead. Link: https://lkml.kernel.org/r/20230505035127.195387-1-mpe@ellerman.id.au Fixes: 1ba3cbf3ec3b ("mm: kfence: improve the performance of __kfence_alloc() and __kfence_free()") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Alexander Potapenko <glider@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-17zsmalloc: move LRU update from zs_map_object() to zs_malloc()Nhat Pham
Under memory pressure, we sometimes observe the following crash: [ 5694.832838] ------------[ cut here ]------------ [ 5694.842093] list_del corruption, ffff888014b6a448->next is LIST_POISON1 (dead000000000100) [ 5694.858677] WARNING: CPU: 33 PID: 418824 at lib/list_debug.c:47 __list_del_entry_valid+0x42/0x80 [ 5694.961820] CPU: 33 PID: 418824 Comm: fuse_counters.s Kdump: loaded Tainted: G S 5.19.0-0_fbk3_rc3_hoangnhatpzsdynshrv41_10870_g85a9558a25de #1 [ 5694.990194] Hardware name: Wiwynn Twin Lakes MP/Twin Lakes Passive MP, BIOS YMM16 05/24/2021 [ 5695.007072] RIP: 0010:__list_del_entry_valid+0x42/0x80 [ 5695.017351] Code: 08 48 83 c2 22 48 39 d0 74 24 48 8b 10 48 39 f2 75 2c 48 8b 51 08 b0 01 48 39 f2 75 34 c3 48 c7 c7 55 d7 78 82 e8 4e 45 3b 00 <0f> 0b eb 31 48 c7 c7 27 a8 70 82 e8 3e 45 3b 00 0f 0b eb 21 48 c7 [ 5695.054919] RSP: 0018:ffffc90027aef4f0 EFLAGS: 00010246 [ 5695.065366] RAX: 41fe484987275300 RBX: ffff888008988180 RCX: 0000000000000000 [ 5695.079636] RDX: ffff88886006c280 RSI: ffff888860060480 RDI: ffff888860060480 [ 5695.093904] RBP: 0000000000000002 R08: 0000000000000000 R09: ffffc90027aef370 [ 5695.108175] R10: 0000000000000000 R11: ffffffff82fdf1c0 R12: 0000000010000002 [ 5695.122447] R13: ffff888014b6a448 R14: ffff888014b6a420 R15: 00000000138dc240 [ 5695.136717] FS: 00007f23a7d3f740(0000) GS:ffff888860040000(0000) knlGS:0000000000000000 [ 5695.152899] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5695.164388] CR2: 0000560ceaab6ac0 CR3: 000000001c06c001 CR4: 00000000007706e0 [ 5695.178659] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 5695.192927] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 5695.207197] PKRU: 55555554 [ 5695.212602] Call Trace: [ 5695.217486] <TASK> [ 5695.221674] zs_map_object+0x91/0x270 [ 5695.229000] zswap_frontswap_store+0x33d/0x870 [ 5695.237885] ? do_raw_spin_lock+0x5d/0xa0 [ 5695.245899] __frontswap_store+0x51/0xb0 [ 5695.253742] swap_writepage+0x3c/0x60 [ 5695.261063] shrink_page_list+0x738/0x1230 [ 5695.269255] shrink_lruvec+0x5ec/0xcd0 [ 5695.276749] ? shrink_slab+0x187/0x5f0 [ 5695.284240] ? mem_cgroup_iter+0x6e/0x120 [ 5695.292255] shrink_node+0x293/0x7b0 [ 5695.299402] do_try_to_free_pages+0xea/0x550 [ 5695.307940] try_to_free_pages+0x19a/0x490 [ 5695.316126] __folio_alloc+0x19ff/0x3e40 [ 5695.323971] ? __filemap_get_folio+0x8a/0x4e0 [ 5695.332681] ? walk_component+0x2a8/0xb50 [ 5695.340697] ? generic_permission+0xda/0x2a0 [ 5695.349231] ? __filemap_get_folio+0x8a/0x4e0 [ 5695.357940] ? walk_component+0x2a8/0xb50 [ 5695.365955] vma_alloc_folio+0x10e/0x570 [ 5695.373796] ? walk_component+0x52/0xb50 [ 5695.381634] wp_page_copy+0x38c/0xc10 [ 5695.388953] ? filename_lookup+0x378/0xbc0 [ 5695.397140] handle_mm_fault+0x87f/0x1800 [ 5695.405157] do_user_addr_fault+0x1bd/0x570 [ 5695.413520] exc_page_fault+0x5d/0x110 [ 5695.421017] asm_exc_page_fault+0x22/0x30 After some investigation, I have found the following issue: unlike other zswap backends, zsmalloc performs the LRU list update at the object mapping time, rather than when the slot for the object is allocated. This deviation was discussed and agreed upon during the review process of the zsmalloc writeback patch series: https://lore.kernel.org/lkml/Y3flcAXNxxrvy3ZH@cmpxchg.org/ Unfortunately, this introduces a subtle bug that occurs when there is a concurrent store and reclaim, which interleave as follows: zswap_frontswap_store() shrink_worker() zs_malloc() zs_zpool_shrink() spin_lock(&pool->lock) zs_reclaim_page() zspage = find_get_zspage() spin_unlock(&pool->lock) spin_lock(&pool->lock) zspage = list_first_entry(&pool->lru) list_del(&zspage->lru) zspage->lru.next = LIST_POISON1 zspage->lru.prev = LIST_POISON2 spin_unlock(&pool->lock) zs_map_object() spin_lock(&pool->lock) if (!list_empty(&zspage->lru)) list_del(&zspage->lru) CHECK_DATA_CORRUPTION(next == LIST_POISON1) /* BOOM */ With the current upstream code, this issue rarely happens. zswap only triggers writeback when the pool is already full, at which point all further store attempts are short-circuited. This creates an implicit pseudo-serialization between reclaim and store. I am working on a new zswap shrinking mechanism, which makes interleaving reclaim and store more likely, exposing this bug. zbud and z3fold do not have this problem, because they perform the LRU list update in the alloc function, while still holding the pool's lock. This patch fixes the aforementioned bug by moving the LRU update back to zs_malloc(), analogous to zbud and z3fold. Link: https://lkml.kernel.org/r/20230505185054.2417128-1-nphamcs@gmail.com Fixes: 64f768c6b32e ("zsmalloc: add a LRU to zs_pool to keep track of zspages in LRU order") Signed-off-by: Nhat Pham <nphamcs@gmail.com> Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Dan Streetman <ddstreet@ieee.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Seth Jennings <sjenning@redhat.com> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-17mm: shrinkers: fix race condition on debugfs cleanupJoan Bruguera Micó
When something registers and unregisters many shrinkers, such as: for x in $(seq 10000); do unshare -Ui true; done Sometimes the following error is printed to the kernel log: debugfs: Directory '...' with parent 'shrinker' already present! This occurs since commit badc28d4924b ("mm: shrinkers: fix deadlock in shrinker debugfs") / v6.2: Since the call to `debugfs_remove_recursive` was moved outside the `shrinker_rwsem`/`shrinker_mutex` lock, but the call to `ida_free` stayed inside, a newly registered shrinker can be re-assigned that ID and attempt to create the debugfs directory before the directory from the previous shrinker has been removed. The locking changes in commit f95bdb700bc6 ("mm: vmscan: make global slab shrink lockless") made the race condition more likely, though it existed before then. Commit badc28d4924b ("mm: shrinkers: fix deadlock in shrinker debugfs") could be reverted since the issue is addressed should no longer occur since the count and scan operations are lockless since commit 20cd1892fcc3 ("mm: shrinkers: make count and scan in shrinker debugfs lockless"). However, since this is a contended lock, prefer instead moving `ida_free` outside the lock to avoid the race. Link: https://lkml.kernel.org/r/20230503013232.299211-1-joanbrugueram@gmail.com Fixes: badc28d4924b ("mm: shrinkers: fix deadlock in shrinker debugfs") Signed-off-by: Joan Bruguera Micó <joanbrugueram@gmail.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-17maple_tree: make maple state reusable after mas_empty_area()Peng Zhang
Make mas->min and mas->max point to a node range instead of a leaf entry range. This allows mas to still be usable after mas_empty_area() returns. Users would get unexpected results from other operations on the maple state after calling the affected function. For example, x86 MAP_32BIT mmap() acts as if there is no suitable gap when there should be one. Link: https://lkml.kernel.org/r/20230505145829.74574-1-zhangpeng.00@bytedance.com Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com> Reported-by: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> Reported-by: Tad <support@spotco.us> Reported-by: Michael Keyes <mgkeyes@vigovproductions.net> Link: https://lore.kernel.org/linux-mm/32f156ba80010fd97dbaf0a0cdfc84366608624d.camel@intel.com/ Link: https://lore.kernel.org/linux-mm/e6108286ac025c268964a7ead3aab9899f9bc6e9.camel@spotco.us/ Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Tested-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-05-18rethook, fprobe: do not trace rethook related functionsZe Gao
These functions are already marked as NOKPROBE to prevent recursion and we have the same reason to blacklist them if rethook is used with fprobe, since they are beyond the recursion-free region ftrace can guard. Link: https://lore.kernel.org/all/20230517034510.15639-5-zegao@tencent.com/ Fixes: f3a112c0c40d ("x86,rethook,kprobes: Replace kretprobe with rethook on x86") Signed-off-by: Ze Gao <zegao@tencent.com> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-05-18fprobe: add recursion detection in fprobe_exit_handlerZe Gao
fprobe_hander and fprobe_kprobe_handler has guarded ftrace recursion detection but fprobe_exit_handler has not, which possibly introduce recursive calls if the fprobe exit callback calls any traceable functions. Checking in fprobe_hander or fprobe_kprobe_handler is not enough and misses this case. So add recursion free guard the same way as fprobe_hander. Since ftrace recursion check does not employ ip(s), so here use entry_ip and entry_parent_ip the same as fprobe_handler. Link: https://lore.kernel.org/all/20230517034510.15639-4-zegao@tencent.com/ Fixes: 5b0ab78998e3 ("fprobe: Add exit_handler support") Signed-off-by: Ze Gao <zegao@tencent.com> Cc: stable@vger.kernel.org Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2023-05-18fprobe: make fprobe_kprobe_handler recursion freeZe Gao
Current implementation calls kprobe related functions before doing ftrace recursion check in fprobe_kprobe_handler, which opens door to kernel crash due to stack recursion if preempt_count_{add, sub} is traceable in kprobe_busy_{begin, end}. Things goes like this without this patch quoted from Steven: " fprobe_kprobe_handler() { kprobe_busy_begin() { preempt_disable() { preempt_count_add() { <-- trace fprobe_kprobe_handler() { [ wash, rinse, repeat, CRASH!!! ] " By refactoring the common part out of fprobe_kprobe_handler and fprobe_handler and call ftrace recursion detection at the very beginning, the whole fprobe_kprobe_handler is free from recursion. [ Fix the indentation of __fprobe_handler() parameters. ] Link: https://lore.kernel.org/all/20230517034510.15639-3-zegao@tencent.com/ Fixes: ab51e15d535e ("fprobe: Introduce FPROBE_FL_KPROBE_SHARED flag for fprobe") Signed-off-by: Ze Gao <zegao@tencent.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>