Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fix from jun Heo:
"This contains a fix for sched_ext idle CPU selection that likely fixes
a substantial performance regression.
The scx_bpf_select_cpu_dfl/and() kfuncs were incorrectly detecting all
tasks as migration-disabled when called outside ops.select_cpu(),
causing them to always return -EBUSY instead of finding idle CPUs.
The fix properly distinguishes between genuinely migration-disabled
tasks vs. the current task whose migration is temporarily disabled by
BPF execution"
* tag 'sched_ext-for-6.17-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext: idle: Handle migration-disabled tasks in BPF code
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd fixes from Jason Gunthorpe:
"Fix two user triggerable use-after-free issues:
- Possible race UAF setting up mmaps
- Syzkaller found UAF when erroring an file descriptor creation ioctl
due to the fput() work queue"
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
iommufd/selftest: Update the fail_nth limit
iommufd: WARN if an object is aborted with an elevated refcount
iommufd: Fix race during abort for file descriptors
iommufd: Fix refcounting race during mmap
|
|
Pull rdma fix from Jason Gunthorpe:
"Just a one line change, was expecting more rc stuff, but it has been
quiet.
- Fix mlx5 devx event delivery to userspace for certain kinds of SRQs"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/mlx5: Fix obj_type mismatch for SRQ event subscriptions
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- work data memory corruption fix in amd_sfh (Basavaraj Natikar)
- fix for regression in cp2112 where setting a GPIO value would always
fail (Sébastien Szymanski)
- fix for regression in hid-lenovo causing driver to fail on non-ACPI
systems (Janne Grunau)
- a couple device ID additions and tiny device-specific quirks
* tag 'hid-for-linus-2025092201' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: amd_sfh: Add sync across amd sfh work functions
HID: asus: add support for missing PX series fn keys
HID: cp2112: fix setter callbacks return value
HID: lenovo: Use KEY_PERFORMANCE instead of ACPI's platform_profile
HID: intel-thc-hid: intel-quickspi: Add WCL Device IDs
HID: intel-thc-hid: intel-quicki2c: Add WCL Device IDs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Two small driver fixes for the Airhoa driver:
- Correct a PHY LED mux value so the PHY LED will blink as it should
- Fix the MDIO function bitmasks, working around a HW bug to
force-enable the MDIO pins"
* tag 'pinctrl-v6.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: airoha: fix wrong MDIO function bitmaks
pinctrl: airoha: fix wrong PHY LED mux value for LED1 GPIO46
|
|
When scx_bpf_select_cpu_dfl()/and() kfuncs are invoked outside of
ops.select_cpu() we can't rely on @p->migration_disabled to determine if
migration is disabled for the task @p.
In fact, migration is always disabled for the current task while running
BPF code: __bpf_prog_enter() disables migration and __bpf_prog_exit()
re-enables it.
To handle this, when @p->migration_disabled == 1, check whether @p is
the current task. If so, migration was not disabled before entering the
callback, otherwise migration was disabled.
This ensures correct idle CPU selection in all cases. The behavior of
ops.select_cpu() remains unchanged, because this callback is never
invoked for the current task and migration-disabled tasks are always
excluded.
Example: without this change scx_bpf_select_cpu_and() called from
ops.enqueue() always returns -EBUSY; with this change applied, it
correctly returns idle CPUs.
Fixes: 06efc9fe0b8de ("sched_ext: idle: Handle migration-disabled tasks in idle selection")
Cc: stable@vger.kernel.org # v6.16+
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Acked-by: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Fixes to the Allwinner and Renesas clk drivers:
- Do the math properly in Allwinner's ccu_mp_recalc_rate() so clk
rates aren't bogus
- Fix a clock domain regression on Renesas R-Car M1A, R-Car H1,
and RZ/A1 by registering the domain after the pmdomain bus is
registered instead of before"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: sunxi-ng: mp: Fix dual-divider clock rate readback
clk: renesas: mstp: Add genpd OF provider at postcore_initcall()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull a few more btrfs fixes from David Sterba:
- in tree-checker, fix wrong size of check for inode ref item
- in ref-verify, handle combination of mount options that allow
partially damaged extent tree (reported by syzbot)
- additional validation of compression mount option to catch invalid
string as level
* tag 'for-6.17-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: reject invalid compression level
btrfs: ref-verify: handle damaged extent root tree
btrfs: tree-checker: fix the incorrect inode ref size check
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"One driver fix for a dma error checking thinko"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: mcq: Fix memory allocation checks for SQE and CQE
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire fix from Takashi Sakamoto:
"When new structures and events were added to UAPI in v6.5 kernel, the
required update to the subsystem ABI version returned to userspace
client was overlooked. The version is now updated"
* tag 'firewire-fixes-6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
firewire: core: fix overlooked update of subsystem ABI version
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
"Fix a SEV-SNP regression when CONFIG_KVM_AMD_SEV is disabled"
* tag 'x86-urgent-2025-09-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sev: Guard sev_evict_cache() with CONFIG_AMD_MEM_ENCRYPT
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into clk-fixes
Pull an Allwinner clk driver fix from Chen-Yu Tsai:
- One fix for the clock rate readback on the recently added dual
divider clocks
* tag 'sunxi-clk-fixes-for-6.17' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
clk: sunxi-ng: mp: Fix dual-divider clock rate readback
|
|
In kernel v6.5, several functions were added to the cdev layer. This
required updating the default version of subsystem ABI up to 6, but
this requirement was overlooked.
This commit updates the version accordingly.
Fixes: 6add87e9764d ("firewire: cdev: add new version of ABI to notify time stamp at request/response subaction of transaction#")
Link: https://lore.kernel.org/r/20250920025148.163402-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
|
|
Pull smb client fixes from Steve French:
- Two unlink fixes: one for rename and one for deferred close
- Four smbdirect/RDMA fixes: fix buffer leak in negotiate, two fixes
for races in smbd_destroy, fix offset and length checks in recv_done
* tag '6.17-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: fix smbdirect_recv_io leak in smbd_negotiate() error path
smb: client: fix file open check in __cifs_unlink()
smb: client: let smbd_destroy() call disable_work_sync(&info->post_send_credits_work)
smb: client: use disable[_delayed]_work_sync in smbdirect.c
smb: client: fix filename matching of deferred files
smb: client: let recv_done verify data_offset, data_length and remaining_data_length
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu fixes from Joerg Roedel:
- Fixes for memory leak and memory corruption bugs on S390 and AMD-Vi
- Race condition fix in AMD-Vi page table code and S390 device attach
code
- Intel VT-d: Fix alignment checks in __domain_mapping()
- AMD-Vi: Fix potentially incorrect DTE settings when device has
aliases
* tag 'iommu-fixes-v6.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
iommu/amd/pgtbl: Fix possible race while increase page table level
iommu/amd: Fix alias device DTE setting
iommu/s390: Make attach succeed when the device was surprise removed
iommu/vt-d: Fix __domain_mapping()'s usage of switch_to_super_page()
iommu/s390: Fix memory corruption when using identity domain
iommu/amd: Fix ivrs_base memleak in early_amd_iommu_init()
|
|
Pull block fixes from Jens Axboe:
"A set of fixes for an issue with md array assembly and drbd for
devices supporting write zeros"
* tag 'block-6.17-20250918' of git://git.kernel.dk/linux:
drbd: init queue_limits->max_hw_wzeroes_unmap_sectors parameter
md: init queue_limits->max_hw_wzeroes_unmap_sectors parameter
|
|
Pull io_uring fixes from Jens Axboe:
- Fix for a regression introduced in the io-wq worker creation logic.
- Remove the allocation cache for the msg_ring io_kiocb allocations. I
have a suspicion that there's a bug there, and since we just fixed
one in that area, let's just yank the use of that cache entirely.
It's not that important, and it kills some code.
- Treat a closed ring like task exiting in that any requests that
trigger post that condition should just get canceled. Doesn't fix any
real issues, outside of having tasks being able to rely on that
guarantee.
- Fix for a bug in the network zero-copy notification mechanism, where
a comparison for matching tctx/ctx for notifications was buggy in
that it didn't correctly compare with the previous notification.
* tag 'io_uring-6.17-20250919' of git://git.kernel.dk/linux:
io_uring: fix incorrect io_kiocb reference in io_link_skb
io_uring/msg_ring: kill alloc_cache for io_kiocb allocations
io_uring: include dying ring in task_work "should cancel" state
io_uring/io-wq: fix `max_workers` breakage and `nr_workers` underflow
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix an ACPI I2C HID driver breakage due to not initializing a
structure on the stack and passing garbage down to GPIO core
- ignore touchpad wakeup on GPD G1619-05
- fix debouncing configuration when looking up GPIOs in ACPI
* tag 'gpio-fixes-for-v6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpiolib: acpi: initialize acpi_gpio_info struct
gpiolib: acpi: Ignore touchpad wakeup on GPD G1619-05
gpiolib: acpi: Program debounce when finding GPIO
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:
- mvsdio: Fix dma_unmap_sg() nents value
- sdhci: Fix clock management for UHS-II
- sdhci-pci-gli: Fix initialization of UHS-II for GL9767
* tag 'mmc-v6.17-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-pci-gli: GL9767: Fix initializing the UHS-II interface during a power-on
mmc: sdhci-uhs2: Fix calling incorrect sdhci_set_clock() function
mmc: sdhci: Move the code related to setting the clock from sdhci_set_ios_common() into sdhci_set_ios()
mmc: mvsdio: Fix dma_unmap_sg() nents value
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:
"pmdomain core:
- Restore behaviour for disabling unused PM domains and introduce the
GENPD_FLAG_NO_STAY_ON configuration bit
pmdomain providers:
- renesas: Don't keep unused PM domains powered-on
- rockchip: Fix regulator dependency with GENPD_FLAG_NO_STAY_ON"
* tag 'pmdomain-v6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: renesas: rmobile-sysc: Don't keep unused PM domains powered-on
pmdomain: renesas: rcar-gen4-sysc: Don't keep unused PM domains powered-on
pmdomain: renesas: rcar-sysc: Don't keep unused PM domains powered-on
pmdomain: rockchip: Fix regulator dependency with GENPD_FLAG_NO_STAY_ON
pmdomain: core: Restore behaviour for disabling unused PM domains
pmdomain: renesas: rcar-sysc: Make rcar_sysc_onecell_np __initdata
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"Fix some build warnings for RUST-enabled objtool check, align ACPI
structures for ARCH_STRICT_ALIGN, fix an unreliable stack for live
patching, add some NULL pointer checkings, and fix some bugs around
KVM"
* tag 'loongarch-fixes-6.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: KVM: Avoid copy_*_user() with lock hold in kvm_pch_pic_regs_access()
LoongArch: KVM: Avoid copy_*_user() with lock hold in kvm_eiointc_sw_status_access()
LoongArch: KVM: Avoid copy_*_user() with lock hold in kvm_eiointc_regs_access()
LoongArch: KVM: Avoid copy_*_user() with lock hold in kvm_eiointc_ctrl_access()
LoongArch: KVM: Fix VM migration failure with PTW enabled
LoongArch: KVM: Remove unused returns and semicolons
LoongArch: vDSO: Check kcalloc() result in init_vdso()
LoongArch: Fix unreliable stack for live patching
LoongArch: Replace sprintf() with sysfs_emit()
LoongArch: Check the return value when creating kobj
LoongArch: Align ACPI structures if ARCH_STRICT_ALIGN enabled
LoongArch: Update help info of ARCH_STRICT_ALIGN
LoongArch: Handle jump tables options for RUST
LoongArch: Make LTO case independent in Makefile
objtool/LoongArch: Mark special atomic instruction as INSN_BUG type
objtool/LoongArch: Mark types based on break immediate code
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes a NULL pointer dereference in ccp and a couple of bugs in
the af_alg interface"
* tag 'v6.17-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: af_alg - Disallow concurrent writes in af_alg_sendmsg
crypto: af_alg - Set merge to zero early in af_alg_sendmsg
crypto: ccp - Always pass in an error pointer to __sev_platform_shutdown_locked()
|
|
The process of the report is delegated across different work functions.
Hence, add a sync mechanism to protect SFH work data across functions.
Fixes: 4b2c53d93a4b ("SFH:Transport Driver to add support of AMD Sensor Fusion Hub (SFH)")
Reported-by: Matthew Schwartz <matthew.schwartz@linux.dev>
Closes: https://lore.kernel.org/all/a21abca5-4268-449d-95f1-bdd7a25894a5@linux.dev/
Tested-by: Prakruthi SP <Prakruthi.SP@amd.com>
Co-developed-by: Akshata MukundShetty <akshata.mukundshetty@amd.com>
Signed-off-by: Akshata MukundShetty <akshata.mukundshetty@amd.com>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes. The volume became higher than wished, but
nothing really stands out -- all small, nice and smooth.
A slightly large change is found in qcom USB-audio offload stuff, but
this is a regression fix specific to this device, hence it should be
safe to apply at this late stage.
- Various small fixes for ASoC Cirrus, Realtek, lpass, Intel and
Qualcomm drivers
- ASoC SoundWire fixes
- A few TAS2781 HD-audio side-codec driver fixes
- A fix for Qualcomm USB-audio offload breakage
- Usual a few HD-audio quirks"
* tag 'sound-6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (35 commits)
ALSA: hda/realtek: Fix mute led for HP Laptop 15-dw4xx
ALSA: hda: intel-dsp-config: Prevent SEGFAULT if ACPI_HANDLE() is NULL
ALSA: usb: qcom: Fix false-positive address space check
ASoC: rt5682s: Adjust SAR ADC button mode to fix noise issue
ASoC: Intel: PTL: Add entry for HDMI-In capture support to non-I2S codec boards.
ASoC: amd: acp: Fix incorrect retrival of acp_chip_info
ASoC: Intel: sof_sdw: use PRODUCT_FAMILY for Fatcat series
ASoC: qcom: sc8280xp: Fix sound card driver name match data for QCS8275
ALSA: hda/realtek: Fix volume control on Lenovo Thinkbook 13x Gen 4
ALSA: hda/realtek: Support Lenovo Thinkbook 13x Gen 5
ALSA: hda: cs35l41: Support Lenovo Thinkbook 13x Gen 5
ALSA: hda/realtek: Add ALC295 Dell TAS2781 I2C fixup
ALSA: hda/tas2781: Fix a potential race condition that causes a NULL pointer in case no efi.get_variable exsits
ASoC: qcom: sc8280xp: Enable DAI format configuration for MI2S interfaces
ASoC: qcom: q6apm-lpass-dais: Fix missing set_fmt DAI op for I2S
ASoC: qcom: audioreach: Fix lpaif_type configuration for the I2S interface
ASoC: Intel: catpt: Expose correct bit depth to userspace
ALSA: hda/tas2781: Fix the order of TAS2781 calibrated-data
ASoC: codecs: lpass-wsa-macro: Fix speaker quality distortion
ASoC: codecs: lpass-rx-macro: Fix playback quality distortion
...
|
|
Pull drm fixes from Dave Airlie:
"Weekly fixes for drm, it's a bit busier than I'd like on the xe side
this week, but otherwise amdgpu and some smaller fixes for i915/bridge
and a revert on docs.
docs:
- fix docs build regression
i915:
- Honor VESA eDP backlight luminance control capability
bridge:
- anx7625: Fix NULL pointer dereference with early IRQ
- cdns-mhdp8546: Fix missing mutex unlock on error path
xe:
- Release kobject for the failure path
- SRIOV PF: Drop rounddown_pow_of_two fair
- Remove type casting on hwmon
- Defer free of NVM auxiliary container to device release
- Fix a NULL vs IS_ERR
- Add cleanup action in xe_device_sysfs_init
- Fix error handling if PXP fails to start
- Set GuC RCS/CCS yield policy
amdgpu:
- GC 11.0.1/4 cleaner shader support
- DC irq fix
- OD fix
amdkfd:
- S0ix fix"
* tag 'drm-fixes-2025-09-19' of https://gitlab.freedesktop.org/drm/kernel:
drm/amdgpu: suspend KFD and KGD user queues for S0ix
drm/amdkfd: add proper handling for S0ix
drm/xe/guc: Set RCS/CCS yield policy
drm/xe: Fix error handling if PXP fails to start
drm/xe/sysfs: Add cleanup action in xe_device_sysfs_init
drm/amd: Only restore cached manual clock settings in restore if OD enabled
drm/xe: Fix a NULL vs IS_ERR() in xe_vm_add_compute_exec_queue()
drm: bridge: cdns-mhdp8546: Fix missing mutex unlock on error path
drm/i915/backlight: Honor VESA eDP backlight luminance control capability
drm/amd/display: Allow RX6xxx & RX7700 to invoke amdgpu_irq_get/put
drm/amdgpu/gfx11: Add Cleaner Shader Support for GFX11.0.1/11.0.4 GPUs
drm: bridge: anx7625: Fix NULL pointer dereference with early IRQ
drm/xe: defer free of NVM auxiliary container to device release callback
drm/xe/hwmon: Remove type casting
drm/xe/pf: Drop rounddown_pow_of_two fair LMEM limitation
drm/xe/tile: Release kobject for the failure path
Revert "drm: Add directive to format code in comment"
|
|
There are more failure conditions now so 400 iterations is not enough pass
them all, up it to 1000. The limit exists so it doesn't infinite loop.
Link: https://patch.msgid.link/r/3-v1-02cd136829df+31-iommufd_syz_fput_jgg@nvidia.com
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
If something holds a refcount then it is at risk of UAFing. For abort
paths we expect the caller to never share the object with a parallel
thread and to clean up any refcounts it obtained on its own.
Add the missing dec inside iommufd_hwpt_paging_alloc() during error unwind
by making iommufd_hw_pagetable_attach/detach() proper pairs.
Link: https://patch.msgid.link/r/2-v1-02cd136829df+31-iommufd_syz_fput_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
fput() doesn't actually call file_operations release() synchronously, it
puts the file on a work queue and it will be released eventually.
This is normally fine, except for iommufd the file and the iommufd_object
are tied to gether. The file has the object as it's private_data and holds
a users refcount, while the object is expected to remain alive as long as
the file is.
When the allocation of a new object aborts before installing the file it
will fput() the file and then go on to immediately kfree() the obj. This
causes a UAF once the workqueue completes the fput() and tries to
decrement the users refcount.
Fix this by putting the core code in charge of the file lifetime, and call
__fput_sync() during abort to ensure that release() is called before
kfree. __fput_sync() is a bit too tricky to open code in all the object
implementations. Instead the objects tell the core code where the file
pointer is and the core will take care of the life cycle.
If the object is successfully allocated then the file will hold a users
refcount and the iommufd_object cannot be destroyed.
It is worth noting that close(); ioctl(IOMMU_DESTROY); doesn't have an
issue because close() is already using a synchronous version of fput().
The UAF looks like this:
BUG: KASAN: slab-use-after-free in iommufd_eventq_fops_release+0x45/0xc0 drivers/iommu/iommufd/eventq.c:376
Write of size 4 at addr ffff888059c97804 by task syz.0.46/6164
CPU: 0 UID: 0 PID: 6164 Comm: syz.0.46 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xcd/0x630 mm/kasan/report.c:482
kasan_report+0xe0/0x110 mm/kasan/report.c:595
check_region_inline mm/kasan/generic.c:183 [inline]
kasan_check_range+0x100/0x1b0 mm/kasan/generic.c:189
instrument_atomic_read_write include/linux/instrumented.h:96 [inline]
atomic_fetch_sub_release include/linux/atomic/atomic-instrumented.h:400 [inline]
__refcount_dec include/linux/refcount.h:455 [inline]
refcount_dec include/linux/refcount.h:476 [inline]
iommufd_eventq_fops_release+0x45/0xc0 drivers/iommu/iommufd/eventq.c:376
__fput+0x402/0xb70 fs/file_table.c:468
task_work_run+0x14d/0x240 kernel/task_work.c:227
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop+0xeb/0x110 kernel/entry/common.c:43
exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline]
syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline]
do_syscall_64+0x41c/0x4c0 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Link: https://patch.msgid.link/r/1-v1-02cd136829df+31-iommufd_syz_fput_jgg@nvidia.com
Cc: stable@vger.kernel.org
Fixes: 07838f7fd529 ("iommufd: Add iommufd fault object")
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nirmoy Das <nirmoyd@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reported-by: syzbot+80620e2d0d0a33b09f93@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/68c8583d.050a0220.2ff435.03a2.GAE@google.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The owner object of the imap can be destroyed while the imap remains in
the mtree. So access to the imap pointer without holding locks is racy
with destruction.
The imap is safe to access outside the lock once a users refcount is
obtained, the owner object cannot start destruction until users is 0.
Thus the users refcount should not be obtained at the end of
iommufd_fops_mmap() but instead inside the mtree lock held around the
mtree_load(). Move the refcount there and use refcount_inc_not_zero() as
we can have a 0 refcount inside the mtree during destruction races.
Link: https://patch.msgid.link/r/0-v1-e6faace50971+3cc-iommufd_mmap_fix_jgg@nvidia.com
Cc: stable@vger.kernel.org
Fixes: 56e9a0d8e53f ("iommufd: Add mmap interface")
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
In io_link_skb function, there is a bug where prev_notif is incorrectly
assigned using 'nd' instead of 'prev_nd'. This causes the context
validation check to compare the current notification with itself instead
of comparing it with the previous notification.
Fix by using the correct prev_nd parameter when obtaining prev_notif.
Signed-off-by: Yang Xiuwei <yangxiuwei@kylinos.cn>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Fixes: 6fe4220912d19 ("io_uring/notif: implement notification stacking")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The AMD IOMMU host page table implementation supports dynamic page table levels
(up to 6 levels), starting with a 3-level configuration that expands based on
IOVA address. The kernel maintains a root pointer and current page table level
to enable proper page table walks in alloc_pte()/fetch_pte() operations.
The IOMMU IOVA allocator initially starts with 32-bit address and onces its
exhuasted it switches to 64-bit address (max address is determined based
on IOMMU and device DMA capability). To support larger IOVA, AMD IOMMU
driver increases page table level.
But in unmap path (iommu_v1_unmap_pages()), fetch_pte() reads
pgtable->[root/mode] without lock. So its possible that in exteme corner case,
when increase_address_space() is updating pgtable->[root/mode], fetch_pte()
reads wrong page table level (pgtable->mode). It does compare the value with
level encoded in page table and returns NULL. This will result is
iommu_unmap ops to fail and upper layer may retry/log WARN_ON.
CPU 0 CPU 1
------ ------
map pages unmap pages
alloc_pte() -> increase_address_space() iommu_v1_unmap_pages() -> fetch_pte()
pgtable->root = pte (new root value)
READ pgtable->[mode/root]
Reads new root, old mode
Updates mode (pgtable->mode += 1)
Since Page table level updates are infrequent and already synchronized with a
spinlock, implement seqcount to enable lock-free read operations on the read path.
Fixes: 754265bcab7 ("iommu/amd: Fix race in increase_address_space()")
Reported-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Cc: stable@vger.kernel.org
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.17-2025-09-18:
amdgpu:
- GC 11.0.1/4 cleaner shader support
- DC irq fix
- OD fix
amdkfd:
- S0ix fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250918191428.2553105-1-alexander.deucher@amd.com
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
- Release kobject for the failure path (Shuicheng)
- SRIOV PF: Drop rounddown_pow_of_two fair (Michal)
- Remove type casting on hwmon (Mallesh)
- Defer free of NVM auxiliary container to device release (Nitin)
- Fix a NULL vs IS_ERR (Dan)
- Add cleanup action in xe_device_sysfs_init (Zongyao)
- Fix error handling if PXP fails to start (Daniele)
- Set GuC RCS/CCS yield policy (Daniele)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/aMwL7vxFP1L94IML@intel.com
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
One fix for a documentation warning, a null pointer dereference fix for
anx7625, and a mutex unlock fix for cdns-mhdp8546
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://lore.kernel.org/r/20250918-orthodox-pretty-puma-1ddeea@houat
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull runtime verifier fixes from Steven Rostedt:
- Fix build in some RISC-V flavours
Some system calls only are available for the 64bit RISC-V machines.
#ifdef out the cases of clock_nanosleep and futex in the sleep
monitor if they are not supported by the architecture.
- Fix wrong cast, obsolete after refactoring
Use container_of() to get to the rv_monitor structure from the
enable_monitors_next() 'p' pointer. The assignment worked only
because the list field used happened to be the first field of the
structure.
- Remove redundant include files
Some include files were listed twice. Remove the extra ones and sort
the includes.
- Fix missing unlock on failure
There was an error path that exited the rv_register_monitor()
function without releasing a lock. Change that to goto the lock
release.
- Add Gabriele Monaco to be Runtime Verifier maintainer
Gabriele is doing most of the work on RV as well as collecting
patches. Add him to the maintainers file for Runtime Verification.
* tag 'trace-rv-v6.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rv: Add Gabriele Monaco as maintainer for Runtime Verification
rv: Fix missing mutex unlock in rv_register_monitor()
include/linux/rv.h: remove redundant include file
rv: Fix wrong type cast in enabled_monitors_next()
rv: Support systems with time64-only syscalls
|
|
During tests of another unrelated patch I was able to trigger this
error: Objects remaining on __kmem_cache_shutdown()
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Fixes: f198186aa9bb ("CIFS: SMBD: Establish SMB Direct connection")
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Fix the file open check to decide whether or not silly-rename the file
in SMB2+.
Fixes: c5ea3065586d ("smb: client: fix data loss due to broken rename(2)")
Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Cc: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Cc: linux-cifs@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Gabriele will start taking over managing the changes to the Runtime
Verification. Make him officially one of the maintainers.
Cc: Gabriele Monaco <gmonaco@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/20250911115744.66ccade3@gandalf.local.home
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
A recent commit:
fc582cd26e88 ("io_uring/msg_ring: ensure io_kiocb freeing is deferred for RCU")
fixed an issue with not deferring freeing of io_kiocb structs that
msg_ring allocates to after the current RCU grace period. But this only
covers requests that don't end up in the allocation cache. If a request
goes into the alloc cache, it can get reused before it is sane to do so.
A recent syzbot report would seem to indicate that there's something
there, however it may very well just be because of the KASAN poisoning
that the alloc_cache handles manually.
Rather than attempt to make the alloc_cache sane for that use case, just
drop the usage of the alloc_cache for msg_ring request payload data.
Fixes: 50cf5f3842af ("io_uring/msg_ring: add an alloc cache for io_kiocb entries")
Link: https://lore.kernel.org/io-uring/68cc2687.050a0220.139b6.0005.GAE@google.com/
Reported-by: syzbot+baa2e0f4e02df602583e@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This laptop uses the ALC236 codec with COEF 0x7 and idx 1 to
control the mute LED. Enable the existing quirk for this device.
Signed-off-by: Praful Adiga <praful.adiga@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
We need to make sure the user queues are preempted so
GFX can enter gfxoff.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Tested-by: David Perry <david.perry@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f8b367e6fa1716cab7cc232b9e3dff29187fc99d)
Cc: stable@vger.kernel.org
|
|
When in S0i3, the GFX state is retained, so all we need to do
is stop the runlist so GFX can enter gfxoff.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Tested-by: David Perry <david.perry@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4bfa8609934dbf39bbe6e75b4f971469384b50b1)
Cc: stable@vger.kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless. No known regressions at this point.
Current release - fix to a fix:
- eth: Revert "net/mlx5e: Update and set Xon/Xoff upon port speed set"
- wifi: iwlwifi: pcie: fix byte count table for 7000/8000 devices
- net: clear sk->sk_ino in sk_set_socket(sk, NULL), fix CRIU
Previous releases - regressions:
- bonding: set random address only when slaves already exist
- rxrpc: fix untrusted unsigned subtract
- eth:
- ice: fix Rx page leak on multi-buffer frames
- mlx5: don't return mlx5_link_info table when speed is unknown
Previous releases - always broken:
- tls: make sure to abort the stream if headers are bogus
- tcp: fix null-deref when using TCP-AO with TCP_REPAIR
- dpll: fix skipping last entry in clock quality level reporting
- eth: qed: don't collect too many protection override GRC elements,
fix memory corruption"
* tag 'net-6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (51 commits)
octeontx2-pf: Fix use-after-free bugs in otx2_sync_tstamp()
cnic: Fix use-after-free bugs in cnic_delete_task
devlink rate: Remove unnecessary 'static' from a couple places
MAINTAINERS: update sundance entry
net: liquidio: fix overflow in octeon_init_instr_queue()
net: clear sk->sk_ino in sk_set_socket(sk, NULL)
Revert "net/mlx5e: Update and set Xon/Xoff upon port speed set"
selftests: tls: test skb copy under mem pressure and OOB
tls: make sure to abort the stream if headers are bogus
selftest: packetdrill: Add tcp_fastopen_server_reset-after-disconnect.pkt.
tcp: Clear tcp_sk(sk)->fastopen_rsk in tcp_disconnect().
octeon_ep: fix VF MAC address lifecycle handling
selftests: bonding: add vlan over bond testing
bonding: don't set oif to bond dev when getting NS target destination
net: rfkill: gpio: Fix crash due to dereferencering uninitialized pointer
net/mlx5e: Add a miss level for ipsec crypto offload
net/mlx5e: Harden uplink netdev access against device unbind
MAINTAINERS: make the DPLL entry cover drivers
doc/netlink: Fix typos in operation attributes
igc: don't fail igc_probe() on LED setup error
...
|
|
Pull kvm fixes from Paolo Bonzini:
"These are mostly Oliver's Arm changes: lock ordering fixes for the
vGIC, and reverts for a buggy attempt to avoid RCU stalls on large
VMs.
Arm:
- Invalidate nested MMUs upon freeing the PGD to avoid WARNs when
visiting from an MMU notifier
- Fixes to the TLB match process and TLB invalidation range for
managing the VCNR pseudo-TLB
- Prevent SPE from erroneously profiling guests due to UNKNOWN reset
values in PMSCR_EL1
- Fix save/restore of host MDCR_EL2 to account for eagerly
programming at vcpu_load() on VHE systems
- Correct lock ordering when dealing with VGIC LPIs, avoiding
scenarios where an xarray's spinlock was nested with a *raw*
spinlock
- Permit stage-2 read permission aborts which are possible in the
case of NV depending on the guest hypervisor's stage-2 translation
- Call raw_spin_unlock() instead of the internal spinlock API
- Fix parameter ordering when assigning VBAR_EL1
- Reverted a couple of fixes for RCU stalls when destroying a stage-2
page table.
There appears to be some nasty refcounting / UAF issues lurking in
those patches and the band-aid we tried to apply didn't hold.
s390:
- mm fixes, including userfaultfd bug fix
x86:
- Sync the vTPR from the local APIC to the VMCB even when AVIC is
active.
This fixes a bug where host updates to the vTPR, e.g. via
KVM_SET_LAPIC or emulation of a guest access, are lost and result
in interrupt delivery issues in the guest"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: SVM: Sync TPR from LAPIC into VMCB::V_TPR even if AVIC is active
Revert "KVM: arm64: Split kvm_pgtable_stage2_destroy()"
Revert "KVM: arm64: Reschedule as needed when destroying the stage-2 page-tables"
KVM: arm64: vgic: fix incorrect spinlock API usage
KVM: arm64: Remove stage 2 read fault check
KVM: arm64: Fix parameter ordering for VBAR_EL1 assignment
KVM: arm64: nv: Fix incorrect VNCR invalidation range calculation
KVM: arm64: vgic-v3: Indicate vgic_put_irq() may take LPI xarray lock
KVM: arm64: vgic-v3: Don't require IRQs be disabled for LPI xarray lock
KVM: arm64: vgic-v3: Erase LPIs from xarray outside of raw spinlocks
KVM: arm64: Spin off release helper from vgic_put_irq()
KVM: arm64: vgic-v3: Use bare refcount for VGIC LPIs
KVM: arm64: vgic: Drop stale comment on IRQ active state
KVM: arm64: VHE: Save and restore host MDCR_EL2 value correctly
KVM: arm64: Initialize PMSCR_EL1 when in VHE
KVM: arm64: nv: fix VNCR TLB ASID match logic for non-Global entries
KVM: s390: Fix FOLL_*/FAULT_FLAG_* confusion
KVM: s390: Fix incorrect usage of mmu_notifier_register()
KVM: s390: Fix access to unavailable adapter indicator pages during postcopy
KVM: arm64: Mark freed S2 MMUs as invalid
|
|
When running task_work for an exiting task, rather than perform the
issue retry attempt, the task_work is canceled. However, this isn't
done for a ring that has been closed. This can lead to requests being
successfully completed post the ring being closed, which is somewhat
confusing and surprising to an application.
Rather than just check the task exit state, also include the ring
ref state in deciding whether or not to terminate a given request when
run from task_work.
Cc: stable@vger.kernel.org # 6.1+
Link: https://github.com/axboe/liburing/discussions/1459
Reported-by: Benedek Thaler <thaler@thaler.hu>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Ilpo Järvinen:
"Fixes and new HW support:
- amd/pmc: Add MECHREVO Yilong15Pro to spurious_8042 list
- amd/pmf: Support new ACPI ID AMDI0108
- asus-wmi: Re-add extra keys to ignore_key_wlan quirk
- oxpec: Add support for AOKZOE A1X and OneXPlayer X1Pro EVA-02"
* tag 'platform-drivers-x86-v6.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: asus-wmi: Re-add extra keys to ignore_key_wlan quirk
platform/x86/amd/pmf: Support new ACPI ID AMDI0108
platform/x86: oxpec: Add support for AOKZOE A1X
platform/x86: oxpec: Add support for OneXPlayer X1Pro EVA-02
platform/x86/amd/pmc: Add MECHREVO Yilong15Pro to spurious_8042 list
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux
Pull UML fixes from Johannes Berg:
"A few fixes for UML, which I'd meant to send earlier but then forgot.
All of them are pretty long-standing issues that are either not really
happening (the UAF), in rarely used code (the FD buffer issue), or an
issue only for some host configurations (the executable stack):
- mark stack not executable to work on more modern systems with
selinux
- fix use-after-free in a virtio error path
- fix stack buffer overflow in external unix socket FD receive
function"
* tag 'uml-for-6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux:
um: Fix FD copy size in os_rcv_fd_msg()
um: virtio_uml: Fix use-after-free after put_device in probe
um: Don't mark stack executable
|
|
The original code relies on cancel_delayed_work() in otx2_ptp_destroy(),
which does not ensure that the delayed work item synctstamp_work has fully
completed if it was already running. This leads to use-after-free scenarios
where otx2_ptp is deallocated by otx2_ptp_destroy(), while synctstamp_work
remains active and attempts to dereference otx2_ptp in otx2_sync_tstamp().
Furthermore, the synctstamp_work is cyclic, the likelihood of triggering
the bug is nonnegligible.
A typical race condition is illustrated below:
CPU 0 (cleanup) | CPU 1 (delayed work callback)
otx2_remove() |
otx2_ptp_destroy() | otx2_sync_tstamp()
cancel_delayed_work() |
kfree(ptp) |
| ptp = container_of(...); //UAF
| ptp-> //UAF
This is confirmed by a KASAN report:
BUG: KASAN: slab-use-after-free in __run_timer_base.part.0+0x7d7/0x8c0
Write of size 8 at addr ffff88800aa09a18 by task bash/136
...
Call Trace:
<IRQ>
dump_stack_lvl+0x55/0x70
print_report+0xcf/0x610
? __run_timer_base.part.0+0x7d7/0x8c0
kasan_report+0xb8/0xf0
? __run_timer_base.part.0+0x7d7/0x8c0
__run_timer_base.part.0+0x7d7/0x8c0
? __pfx___run_timer_base.part.0+0x10/0x10
? __pfx_read_tsc+0x10/0x10
? ktime_get+0x60/0x140
? lapic_next_event+0x11/0x20
? clockevents_program_event+0x1d4/0x2a0
run_timer_softirq+0xd1/0x190
handle_softirqs+0x16a/0x550
irq_exit_rcu+0xaf/0xe0
sysvec_apic_timer_interrupt+0x70/0x80
</IRQ>
...
Allocated by task 1:
kasan_save_stack+0x24/0x50
kasan_save_track+0x14/0x30
__kasan_kmalloc+0x7f/0x90
otx2_ptp_init+0xb1/0x860
otx2_probe+0x4eb/0xc30
local_pci_probe+0xdc/0x190
pci_device_probe+0x2fe/0x470
really_probe+0x1ca/0x5c0
__driver_probe_device+0x248/0x310
driver_probe_device+0x44/0x120
__driver_attach+0xd2/0x310
bus_for_each_dev+0xed/0x170
bus_add_driver+0x208/0x500
driver_register+0x132/0x460
do_one_initcall+0x89/0x300
kernel_init_freeable+0x40d/0x720
kernel_init+0x1a/0x150
ret_from_fork+0x10c/0x1a0
ret_from_fork_asm+0x1a/0x30
Freed by task 136:
kasan_save_stack+0x24/0x50
kasan_save_track+0x14/0x30
kasan_save_free_info+0x3a/0x60
__kasan_slab_free+0x3f/0x50
kfree+0x137/0x370
otx2_ptp_destroy+0x38/0x80
otx2_remove+0x10d/0x4c0
pci_device_remove+0xa6/0x1d0
device_release_driver_internal+0xf8/0x210
pci_stop_bus_device+0x105/0x150
pci_stop_and_remove_bus_device_locked+0x15/0x30
remove_store+0xcc/0xe0
kernfs_fop_write_iter+0x2c3/0x440
vfs_write+0x871/0xd70
ksys_write+0xee/0x1c0
do_syscall_64+0xac/0x280
entry_SYSCALL_64_after_hwframe+0x77/0x7f
...
Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure
that the delayed work item is properly canceled before the otx2_ptp is
deallocated.
This bug was initially identified through static analysis. To reproduce
and test it, I simulated the OcteonTX2 PCI device in QEMU and introduced
artificial delays within the otx2_sync_tstamp() function to increase the
likelihood of triggering the bug.
Fixes: 2958d17a8984 ("octeontx2-pf: Add support for ptp 1-step mode on CN10K silicon")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The original code uses cancel_delayed_work() in cnic_cm_stop_bnx2x_hw(),
which does not guarantee that the delayed work item 'delete_task' has
fully completed if it was already running. Additionally, the delayed work
item is cyclic, the flush_workqueue() in cnic_cm_stop_bnx2x_hw() only
blocks and waits for work items that were already queued to the
workqueue prior to its invocation. Any work items submitted after
flush_workqueue() is called are not included in the set of tasks that the
flush operation awaits. This means that after the cyclic work items have
finished executing, a delayed work item may still exist in the workqueue.
This leads to use-after-free scenarios where the cnic_dev is deallocated
by cnic_free_dev(), while delete_task remains active and attempt to
dereference cnic_dev in cnic_delete_task().
A typical race condition is illustrated below:
CPU 0 (cleanup) | CPU 1 (delayed work callback)
cnic_netdev_event() |
cnic_stop_hw() | cnic_delete_task()
cnic_cm_stop_bnx2x_hw() | ...
cancel_delayed_work() | /* the queue_delayed_work()
flush_workqueue() | executes after flush_workqueue()*/
| queue_delayed_work()
cnic_free_dev(dev)//free | cnic_delete_task() //new instance
| dev = cp->dev; //use
Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure
that the cyclic delayed work item is properly canceled and that any
ongoing execution of the work item completes before the cnic_dev is
deallocated. Furthermore, since cancel_delayed_work_sync() uses
__flush_work(work, true) to synchronously wait for any currently
executing instance of the work item to finish, the flush_workqueue()
becomes redundant and should be removed.
This bug was identified through static analysis. To reproduce the issue
and validate the fix, I simulated the cnic PCI device in QEMU and
introduced intentional delays — such as inserting calls to ssleep()
within the cnic_delete_task() function — to increase the likelihood
of triggering the bug.
Fixes: fdf24086f475 ("cnic: Defer iscsi connection cleanup")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|