Age | Commit message (Collapse) | Author |
|
For WMI_REQUEST_VDEV_STAT request, firmware might split response into
multiple events dut to buffer limit, hence currently in
ath11k_debugfs_fw_stats_process() we wait until all events received.
In case there is no vdev started, this results in that below condition
would never get satisfied
((++ar->fw_stats.num_vdev_recvd) == total_vdevs_started)
finally the requestor would be blocked until wait time out.
The same applies to WMI_REQUEST_BCN_STAT request as well due to:
((++ar->fw_stats.num_bcn_recvd) == ar->num_started_vdevs)
Change to check the number of started vdev first: if it is zero, finish
wait directly; if not, follow the old way.
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.37
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250220082448.31039-4-quic_bqiang@quicinc.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
Currently ath11k_debugfs_fw_stats_process() is using static variables to count
firmware stat events. Taking num_vdev as an example, if for whatever reason (
say ar->num_started_vdevs is 0 or firmware bug etc.) the following condition
(++num_vdev) == total_vdevs_started
is not met, is_end is not set thus num_vdev won't be cleared. Next time when
firmware stats is requested again, even if everything is working fine, we will
fail due to the condition above will never be satisfied.
The same applies to num_bcn as well.
Change to use non-static counters so that we have a chance to clear them each
time firmware stats is requested. Currently only ath11k_fw_stats_request() and
ath11k_debugfs_fw_stats_request() are requesting firmware stats, so clear
counters there.
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.37
Fixes: da3a9d3c1576 ("ath11k: refactor debugfs code into debugfs.c")
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Kalle Valo <kvalo@kernel.org>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250220082448.31039-3-quic_bqiang@quicinc.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
We get report [1] that CPU is running a hot loop in
ath11k_debugfs_fw_stats_request():
94.60% 0.00% i3status [kernel.kallsyms] [k] do_syscall_64
|
--94.60%--do_syscall_64
|
--94.55%--__sys_sendmsg
___sys_sendmsg
____sys_sendmsg
netlink_sendmsg
netlink_unicast
genl_rcv
netlink_rcv_skb
genl_rcv_msg
|
--94.55%--genl_family_rcv_msg_dumpit
__netlink_dump_start
netlink_dump
genl_dumpit
nl80211_dump_station
|
--94.55%--ieee80211_dump_station
sta_set_sinfo
|
--94.55%--ath11k_mac_op_sta_statistics
ath11k_debugfs_get_fw_stats
|
--94.55%--ath11k_debugfs_fw_stats_request
|
|--41.73%--_raw_spin_lock_bh
|
|--22.74%--__local_bh_enable_ip
|
|--9.22%--_raw_spin_unlock_bh
|
--6.66%--srso_alias_safe_ret
This is because, if for whatever reason ar->fw_stats_done is not set by
ath11k_update_stats_event(), ath11k_debugfs_fw_stats_request() won't yield
CPU before an up to 3s timeout.
Change to completion mechanism to avoid CPU burning.
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.37
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
Reported-by: Yury Vostrikov <mon@unformed.ru>
Closes: https://lore.kernel.org/all/7324ac7a-8b7a-42a5-aa19-de52138ff638@app.fastmail.com/ # [1]
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Reviewed-by: Vasanthakumar Thiagarajan <vasanthakumar.thiagarajan@oss.qualcomm.com>
Link: https://patch.msgid.link/20250220082448.31039-2-quic_bqiang@quicinc.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
the wil6210 driver irq handling code is unconditionally writing
edma irq registers which are supposed to be only used on Talyn chipsets.
This however leade to a chipset hang on the older sparrow chipset
generation and firmware will not even boot.
Fix that by simply checking for edma support before handling these
registers.
Tested on Netgear R9000
Signed-off-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Link: https://patch.msgid.link/20250304012131.25970-2-s.gottschall@dd-wrt.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
In some scenarios, the firmware may be stopped before the interface is
removed, either due to a crash or because the remoteproc (e.g., MPSS)
is shut down early during system reboot or shutdown.
This leads to a delay during interface teardown, as the driver waits for
a vdev delete response that never arrives, eventually timing out.
Example (SNOC):
$ echo stop > /sys/class/remoteproc/remoteproc0/state
[ 71.64] remoteproc remoteproc0: stopped remote processor modem
$ reboot
[ 74.84] ath10k_snoc c800000.wifi: failed to transmit packet, dropping: -108
[ 74.84] ath10k_snoc c800000.wifi: failed to submit frame: -108
[...]
[ 82.39] ath10k_snoc c800000.wifi: Timeout in receiving vdev delete response
To avoid this, skip waiting for the vdev delete response if the firmware is
already marked as unreachable (`ATH10K_FLAG_CRASH_FLUSH`), similar to how
`ath10k_mac_wait_tx_complete()` and `ath10k_vdev_setup_sync()` handle this case.
Signed-off-by: Loic Poulain <loic.poulain@oss.qualcomm.com>
Link: https://patch.msgid.link/20250522131704.612206-1-loic.poulain@oss.qualcomm.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
In ath10k_snoc_hif_stop() we skip disabling the IRQs in the crash
recovery flow, but we still unconditionally call enable again in
ath10k_snoc_hif_start().
We can't check the ATH10K_FLAG_CRASH_FLUSH bit since it is cleared
before hif_start() is called, so instead check the
ATH10K_SNOC_FLAG_RECOVERY flag and skip enabling the IRQs during crash
recovery.
This fixes unbalanced IRQ enable splats that happen after recovering from
a crash.
Fixes: 0e622f67e041 ("ath10k: add support for WCN3990 firmware crash recovery")
Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org>
Tested-by: Loic Poulain <loic.poulain@oss.qualcomm.com>
Link: https://patch.msgid.link/20250318205043.1043148-1-caleb.connolly@linaro.org
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
|
|
Make acpi_processor_driver_init() call arch_cpu_rescan_dead_smt_siblings(),
via a new wrapper function called acpi_idle_rescan_dead_smt_siblings(),
after successfully initializing the driver, to allow the "dead" SMT
siblings to go into deep idle states, which is necessary for the
processor to be able to reach deep package C-states (like PC10) going
forward, so that power can be reduced sufficiently in suspend-to-idle,
among other things.
However, do it only if the ACPI idle driver is the current cpuidle
driver (otherwise it is assumed that another cpuidle driver will take
care of this) and avoid doing it on architectures other than x86.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Link: https://patch.msgid.link/2005721.PYKUYFuaPT@rjwysocki.net
|
|
Make intel_idle_init() call arch_cpu_rescan_dead_smt_siblings() after
successfully registering intel_idle as the cpuidle driver so as to
allow the "dead" SMT siblings (if any) to go into deep idle states.
This is necessary for the processor to be able to reach deep package
C-states (like PC10) going forward which is requisite for reducing
power sufficiently in suspend-to-idle, among other things.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Link: https://patch.msgid.link/10669885.nUPlyArG6x@rjwysocki.net
|
|
Without correct unregisteration, ACPI notify handlers and the platform
drivers installed by generic_subdriver_init() will become dangling
references after removing the loongson_laptop module, triggering various
kernel faults when a hotkey is sent or at kernel shutdown.
Cc: stable@vger.kernel.org
Fixes: 6246ed09111f ("LoongArch: Add ACPI-based generic laptop driver")
Signed-off-by: Yao Zi <ziyao@disroot.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
loongson_laptop_turn_{on,off}_backlight() are designed for controlling
the power of the backlight, but they aren't really used in the driver
previously.
Unify these two functions since they only differ in arguments passed to
ACPI method, and wire up loongson_laptop_backlight_update() to update
the power state of the backlight as well. Tested on the TongFang L860-T2
Loongson-3A5000 laptop.
Cc: stable@vger.kernel.org
Fixes: 6246ed09111f ("LoongArch: Add ACPI-based generic laptop driver")
Signed-off-by: Yao Zi <ziyao@disroot.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
Pull SCSI fixes from James Bottomley:
"Mostly trivial updates and bug fixes (core update is a comment
spelling fix).
The bigger UFS update is the clock scaling and frequency fixes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: qcom: Prevent calling phy_exit() before phy_init()
scsi: ufs: qcom: Call ufs_qcom_cfg_timers() in clock scaling path
scsi: ufs: qcom: Map devfreq OPP freq to UniPro Core Clock freq
scsi: ufs: qcom: Check gear against max gear in vop freq_to_gear()
scsi: aacraid: Remove useless code
scsi: core: devinfo: Fix typo in comment
scsi: ufs: core: Don't perform UFS clkscaling during host async scan
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Heiko Carstens:
- Add missing select CRYPTO_ENGINE to CRYPTO_PAES_S390
- Fix secure storage access exception handling when fault handling is
disabled
* tag 's390-6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/mm: Fix in_atomic() handling in do_secure_storage_access()
s390/crypto: Select crypto engine in Kconfig when PAES is chosen
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull more spi updates from Mark Brown:
"A small set of updates that came in during the merge window, we've
got:
- Some small fixes for the Broadcom and spi-pci1xxxx drivers
- A change to the QPIC SNAND driver to flag that the error correction
features are less useful than people might be expecting
- A new device ID for the SOPHGO SG2042
- The addition of Yang Shen as a Huawei maintainer"
* tag 'spi-v6.16-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-qpic-snand: document the limited bit error reporting capability
spi: bcm63xx-hsspi: fix shared reset
spi: bcm63xx-spi: fix shared reset
MAINTAINERS: Update HiSilicon SFC driver maintainer
MAINTAINERS: Update HiSilicon SPI Controller driver maintainer
spi: dt-bindings: spi-sg2044-nor: Add SOPHGO SG2042
spi: spi-pci1xxxx: Fix Probe failure with Dual SPI instance with INTx interrupts
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
"A very minor fix that came in during the merge window, checking for
I/O errors in the MAX14577 driver"
* tag 'regulator-fix-v6.16-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: max14577: Add error check for max14577_read_reg()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux
Pull pwm fixes from Uwe Kleine-König:
"axi-pwmgen: Fix handling of external clock
The pwm-axi-pwmgen device is backed by an FPGA and can be synthesized
in different ways. Relevant here is that it can use one or two
external clock signals. These fix clock handling for the two clocks
case"
* tag 'pwm/for-6.16-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
pwm: axi-pwmgen: fix missing separate external clock
dt-bindings: pwm: adi,axi-pwmgen: Fix clocks
|
|
Pull more block updates from Jens Axboe:
- NVMe pull request via Christoph:
- TCP error handling fix (Shin'ichiro Kawasaki)
- TCP I/O stall handling fixes (Hannes Reinecke)
- fix command limits status code (Keith Busch)
- support vectored buffers also for passthrough (Pavel Begunkov)
- spelling fixes (Yi Zhang)
- MD pull request via Yu:
- fix REQ_RAHEAD and REQ_NOWAIT IO err handling for raid1/10
- fix max_write_behind setting for dm-raid
- some minor cleanups
- Integrity data direction fix and cleanup
- bcache NULL pointer fix
- Fix for loop missing write start/end handling
- Decouple hardware queues and IO threads in ublk
- Slew of ublk selftests additions and updates
* tag 'block-6.16-20250606' of git://git.kernel.dk/linux: (29 commits)
nvme: spelling fixes
nvme-tcp: fix I/O stalls on congested sockets
nvme-tcp: sanitize request list handling
nvme-tcp: remove tag set when second admin queue config fails
nvme: enable vectored registered bufs for passthrough cmds
nvme: fix implicit bool to flags conversion
nvme: fix command limits status code
selftests: ublk: kublk: improve behavior on init failure
block: flip iter directions in blk_rq_integrity_map_user()
block: drop direction param from bio_integrity_copy_user()
selftests: ublk: cover PER_IO_DAEMON in more stress tests
Documentation: ublk: document UBLK_F_PER_IO_DAEMON
selftests: ublk: add stress test for per io daemons
selftests: ublk: add functional test for per io daemons
selftests: ublk: kublk: decouple ublk_queues from ublk server threads
selftests: ublk: kublk: move per-thread data out of ublk_queue
selftests: ublk: kublk: lift queue initialization out of thread
selftests: ublk: kublk: tie sqe allocation to io instead of queue
selftests: ublk: kublk: plumb q_id in io_uring user_data
ublk: have a per-io daemon instead of a per-queue daemon
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt updates from Greg KH:
"Here is the big set of USB and Thunderbolt changes for 6.16-rc1.
Included in here are the following:
- USB offload support for audio devices.
I think this takes the record for the most number of patch series
(30+) over the longest period of time (2+ years) to get merged
properly.
Many props go to Wesley Cheng for seeing this effort through, they
took a major out-of-tree hacked-up-monstrosity that was created by
multiple vendors for their specific devices, got it all merged into
a semi-coherent set of changes, and got all of the different major
subsystems to agree on how this should be implemented both with
changes to their code as well as userspace apis, AND wrangled the
hardware companies into agreeing to go forward with this, despite
making them all redo work they had already done in their private
device trees.
This feature offers major power savings on embedded devices where a
USB audio stream can continue to flow while the rest of the system
is sleeping, something that devices running on battery power really
care about. There are still some more small tweaks left to be done
here, and those patches are still out for review and arguing among
the different hardware companies, but this is a major step forward
and a great example of how to do upstream development well.
- small number of thunderbolt fixes and updates, things seem to be
slowing down here (famous last words...)
- xhci refactors and reworking to try to handle some rough corner
cases in some hardware implementations where things don't always
work properly
- typec driver updates
- USB3 power management reworking and updates
- Removal of some old and orphaned UDC gadget drivers that had not
been used in a very long time, dropping over 11 thousand lines from
the tree, always a nice thing, making up for the 12k lines added
for the USB offload feature.
- lots of little updates and fixes in different drivers
All of these have been in linux-next for over 2 weeks, the USB offload
logic has been in there for 8 weeks now, with no reported issues"
* tag 'usb-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (172 commits)
ALSA: usb-audio: qcom: fix USB_XHCI dependency
ASoC: qdsp6: fix compile-testing without CONFIG_OF
usb: misc: onboard_usb_dev: fix build warning for CONFIG_USB_ONBOARD_DEV_USB5744=n
usb: typec: tipd: fix typo in TPS_STATUS_HIGH_VOLAGE_WARNING macro
USB: typec: fix const issue in typec_match()
USB: gadget: udc: fix const issue in gadget_match_driver()
USB: gadget: fix up const issue with struct usb_function_instance
USB: serial: pl2303: add new chip PL2303GC-Q20 and PL2303GT-2AB
USB: serial: bus: fix const issue in usb_serial_device_match()
usb: usbtmc: Fix timeout value in get_stb
usb: usbtmc: Fix read_stb function and get_stb ioctl
ALSA: qc_audio_offload: try to reduce address space confusion
ALSA: qc_audio_offload: avoid leaking xfer_buf allocation
ALSA: qc_audio_offload: rename dma/iova/va/cpu/phys variables
ALSA: usb-audio: qcom: Fix an error handling path in qc_usb_audio_probe()
usb: misc: onboard_usb_dev: Fix usb5744 initialization sequence
dt-bindings: usb: ti,usb8041: Add binding for TI USB8044 hub controller
usb: misc: onboard_usb_dev: Add support for TI TUSB8044 hub
usb: gadget: lpc32xx_udc: Use USB API functions rather than constants
usb: gadget: epautoconf: Use USB API functions rather than constants
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial updates from Greg KH:
"Here is the big set of tty and serial driver changes for 6.16-rc1.
A little more churn than normal in this portion of the kernel for this
development cycle, Jiri and Nicholas were busy with cleanups and
reviews and fixes for the vt unicode handling logic which composed
most of the overall work in here.
Major changes are:
- vt unicode changes/reverts/changes from Nicholas. This should help
out a lot with screen readers and others that rely on vt console
support
- lock guard additions to the core tty/serial code to clean up lots
of error handling logic
- 8250 driver updates and fixes
- device tree conversions to yaml
- sh-sci driver updates
- other small cleanups and updates for serial drivers and tty core
portions
All of these have been in linux-next for 2 weeks with no reported issues"
* tag 'tty-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (105 commits)
tty: serial: 8250_omap: fix TX with DMA for am33xx
vt: add VT_GETCONSIZECSRPOS to retrieve console size and cursor position
vt: bracketed paste support
vt: remove VT_RESIZE and VT_RESIZEX from vt_compat_ioctl()
vt: process the full-width ASCII fallback range programmatically
vt: make use of ucs_get_fallback() when glyph is unavailable
vt: add ucs_get_fallback()
vt: create ucs_fallback_table.h_shipped with gen_ucs_fallback_table.py
vt: introduce gen_ucs_fallback_table.py to create ucs_fallback_table.h
vt: move glyph determination to a separate function
vt: make sure displayed double-width characters are remembered as such
vt: ucs.c: fix misappropriate in_range() usage
serial: max3100: Replace open-coded parity calculation with parity8()
dt-bindings: serial: 8250_omap: Drop redundant properties
dt-bindings: serial: Convert socionext,milbeaut-usio-uart to DT schema
dt-bindings: serial: Convert microchip,pic32mzda-uart to DT schema
dt-bindings: serial: Convert arm,sbsa-uart to DT schema
dt-bindings: serial: Convert snps,arc-uart to DT schema
dt-bindings: serial: Convert marvell,armada-3700-uart to DT schema
dt-bindings: serial: Convert lantiq,asc to DT schema
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc / iio driver updates from Greg KH:
"Here is the big char/misc/iio and other small driver subsystem pull
request for 6.16-rc1.
Overall, a lot of individual changes, but nothing major, just the
normal constant forward progress of new device support and cleanups to
existing subsystems. Highlights in here are:
- Large IIO driver updates and additions and device tree changes
- Android binder bugfixes and logfile fixes
- mhi driver updates
- comedi driver updates
- counter driver updates and additions
- coresight driver updates and additions
- echo driver removal as there are no in-kernel users of it
- nvmem driver updates
- spmi driver updates
- new amd-sbi driver "subsystem" and drivers added
- rust miscdriver binding documentation fix
- other small driver fixes and updates (uio, w1, acrn, hpet,
xillybus, cardreader drivers, fastrpc and others)
All of these have been in linux-next for quite a while with no
reported problems"
* tag 'char-misc-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (390 commits)
binder: fix yet another UAF in binder_devices
counter: microchip-tcb-capture: Add watch validation support
dt-bindings: iio: adc: Add ROHM BD79100G
iio: adc: add support for Nuvoton NCT7201
dt-bindings: iio: adc: add NCT7201 ADCs
iio: chemical: Add driver for SEN0322
dt-bindings: trivial-devices: Document SEN0322
iio: adc: ad7768-1: reorganize driver headers
iio: bmp280: zero-init buffer
iio: ssp_sensors: optimalize -> optimize
HID: sensor-hub: Fix typo and improve documentation
iio: admv1013: replace redundant ternary operator with just len
iio: chemical: mhz19b: Fix error code in probe()
iio: adc: at91-sama5d2: use IIO_DECLARE_BUFFER_WITH_TS
iio: accel: sca3300: use IIO_DECLARE_BUFFER_WITH_TS
iio: adc: ad7380: use IIO_DECLARE_DMA_BUFFER_WITH_TS
iio: adc: ad4695: rename AD4695_MAX_VIN_CHANNELS
iio: adc: ad4695: use IIO_DECLARE_DMA_BUFFER_WITH_TS
iio: introduce IIO_DECLARE_BUFFER_WITH_TS macros
iio: make IIO_DMA_MINALIGN minimum of 8 bytes
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver updates from Greg KH:
"Here is the "big" set of staging driver changes for 6.16-rc1. Included
in here are:
- gpib driver cleanups and updates.
This subsystem is _almost_ ready to be merged into the main portion
of the kernel tree. Hopefully should happen in the next kernel
merge cycle if all goes well.
- sm750fb driver cleanups
- rtl8723bs driver cleanups
- other small driver cleanups for coding style issues
All of these have been in for over 2 weeks with no reported issues"
* tag 'staging-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (203 commits)
staging: rtl8723bs: remove unnecessary braces for single statement blocks
staging: rtl8723bs: Removed multiple blank lines of rtw_pwrctrl.c
staging: sm750fb: rename `hw_sm750le_setBLANK`
staging: sm750fb: rename `hw_sm750_setBLANK`
staging: sm750fb: rename `hw_sm750_setColReg`
staging: sm750fb: rename `hw_sm750_crtc_setMode`
staging: sm750fb: rename `hw_sm750_crtc_checkMode`
staging: sm750fb: rename `hw_sm750_output_setMode`
staging: sm750fb: rename `hw_sm750le_deWait`
staging: sm750fb: rename `hw_sm750_deWait`
staging: sm750fb: rename `hw_sm750_initAccel`
staging: gpib: switch to kmalloc(sizeof(*status))
staging: gpib: Fix secondary address restriction
staging: gpib: Fix uapi include header guard name
staging: gpib: Avoid unused variable warning
staging: gpib: Declare driver entry points static
staging: gpib: Fix PCMCIA config identifier
staging: gpib: Avoid unused variable warnings
staging: gpib: Fix lpvo request_system_control
staging: sm750fb: rename sm750_hw_cursor_setData2
...
|
|
Pull more drm fixes from Simona Vetter:
"Another small batch of drm fixes, this time with a different baseline
and hence separate.
Drivers:
- ivpu:
- dma_resv locking
- warning fixes
- reset failure handling
- improve logging
- update fw file names
- fix cmdqueue unregister
- panel-simple: add Evervision VGG644804
Core Changes:
- sysfb: screen_info type check
- video: screen_info for relocated pci fb
- drm/sched: signal fence of killed job
- dummycon: deferred takeover fix"
* tag 'drm-fixes-2025-06-06' of https://gitlab.freedesktop.org/drm/kernel:
sysfb: Fix screen_info type check for VGA
video: screen_info: Relocate framebuffers behind PCI bridges
accel/ivpu: Fix warning in ivpu_gem_bo_free()
accel/ivpu: Trigger device recovery on engine reset/resume failure
accel/ivpu: Use dma_resv_lock() instead of a custom mutex
drm/panel-simple: fix the warnings for the Evervision VGG644804
accel/ivpu: Reorder Doorbell Unregister and Command Queue Destruction
accel/ivpu: Use firmware names from upstream repo
accel/ivpu: Improve buffer object logging
dummycon: Trigger redraw when switching consoles with deferred takeover
drm/scheduler: signal scheduled fence when kill job
|
|
It is not necessary to wait until the device_initcall() stage with
intel_idle initialization. All of its dependencies are met after
all subsys_initcall()s have run, so subsys_initcall_sync() can be
used for initializing it.
It is also better to ensure that intel_idle will always initialize
before the ACPI processor driver that uses module_init() for its
initialization.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Link: https://patch.msgid.link/2994397.e9J7NaK4W3@rjwysocki.net
|
|
Pull drm fixes from Dave Airlie:
"This is pretty much two weeks worth of fixes, plus one thing that
might be considered next: amdkfd is now able to be enabled on risc-v
platforms.
Otherwise, amdgpu and xe with the majority of fixes, and then a
smattering all over.
panel:
- nt37801: fix IS_ERR
- nt37801: fix KConfig
connector:
- Fix null deref in HDMI audio helper.
bridge:
- analogix_dp: fixup clk-disable removal
nouveau:
- minor typo fix (',' vs ';')
msm:
- mailmap updates
i915:
- Fix the enabling/disabling of DP audio SDP splitting
- Fix PSR register definitions for ALPM
- Fix u32 overflow in SNPS PHY HDMI PLL setup
- Fix GuC pending message underflow when submit fails
- Fix GuC wakeref underflow race during reset
xe:
- Two documentation fixes
- A couple of vm init fixes
- Hwmon fixes
- Drop reduntant conversion to bool
- Fix CONFIG_INTEL_VSEC dependency
- Rework eviction rejection of bound external bos
- Stop re-submitting signalled jobs
- A couple of pxp fixes
- Add back a fix that got lost in a merge
- Create LRC bo without VM
- Fix for the above fix
amdgpu:
- UserQ fixes
- SMU 13.x fixes
- VCN fixes
- JPEG fixes
- Misc cleanups
- runtime pm fix
- DCN 4.0.1 fixes
- Misc display fixes
- ISP fix
- VRAM manager fix
- RAS fixes
- IP discovery fix
- Cleaner shader fix for GC 10.1.x
- OD fix
- Non-OLED panel fix
- Misc display fixes
- Brightness fixes
amdkfd:
- Enable CONFIG_HSA_AMD on RISCV
- SVM fix
- Misc cleanups
- Ref leak fix
- WPTR BO fix
radeon:
- Misc cleanups"
* tag 'drm-next-2025-06-06' of https://gitlab.freedesktop.org/drm/kernel: (105 commits)
drm/nouveau/vfn/r535: Convert comma to semicolon
drm/xe: remove unmatched xe_vm_unlock() from __xe_exec_queue_init()
drm/xe: Create LRC BO without VM
drm/xe/guc_submit: add back fix
drm/xe/pxp: Clarify PXP queue creation behavior if PXP is not ready
drm/xe/pxp: Use the correct define in the set_property_funcs array
drm/xe/sched: stop re-submitting signalled jobs
drm/xe: Rework eviction rejection of bound external bos
drm/xe/vsec: fix CONFIG_INTEL_VSEC dependency
drm/xe: drop redundant conversion to bool
drm/xe/hwmon: Move card reactive critical power under channel card
drm/xe/hwmon: Add support to manage power limits though mailbox
drm/xe/vm: move xe_svm_init() earlier
drm/xe/vm: move rebind_work init earlier
MAINTAINERS: .mailmap: update Rob Clark's email address
mailmap: Update entry for Akhil P Oommen
MAINTAINERS: update my email address
MAINTAINERS: drop myself as maintainer
drm/i915/display: Fix u32 overflow in SNPS PHY HDMI PLL setup
drm/amd/display: Fix default DC and AC levels
...
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:
ivpu:
- gem: Use dma-resv lock
- gem. Fix a warning
- Trigger recovery on device engine reset/resume failure
panel:
- panel-simple: Fix settings for Evervision VGG644804
sysfb:
- Fix screen_info type check
video:
- Update screen_info for relocated PCI framebuffers
Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20250606072853.GA13099@linux.fritz.box
|
|
The kernel robot reported the following errors when the netc-lib driver
was compiled as a loadable module and the enetc-core driver was built-in.
ld.lld: error: undefined symbol: ntmp_init_cbdr
referenced by enetc_cbdr.c:88 (drivers/net/ethernet/freescale/enetc/enetc_cbdr.c:88)
ld.lld: error: undefined symbol: ntmp_free_cbdr
referenced by enetc_cbdr.c:96 (drivers/net/ethernet/freescale/enetc/enetc_cbdr.c:96)
Simply changing "tristate" to "bool" can fix this issue, but considering
that the netc-lib driver needs to support being compiled as a loadable
module and LS1028 does not need the netc-lib driver. Therefore, we add a
boolean symbol 'NXP_NTMP' to enable 'NXP_NETC_LIB' as needed. And when
adding NETC switch driver support in the future, there is no need to
modify the dependency, just select "NXP_NTMP" and "NXP_NETC_LIB" at the
same time.
Reported-by: Arnd Bergmann <arnd@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505220734.x6TF6oHR-lkp@intel.com/
Fixes: 4701073c3deb ("net: enetc: add initial netc-lib driver to support NTMP")
Suggested-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Previously during driver probe, 1 is unconditionally taken as current
brightness value and set to props.brightness, which will be considered
as the brightness before suspend and restored to EC on resume. Since a
brightness value of 1 almost never matches EC's state on coldboot (my
laptop's EC defaults to 80), this causes surprising changes of screen
brightness on the first time of resume after coldboot.
Let's get brightness from EC and take it as the current brightness on
probe of the laptop driver to avoid the surprising behavior. Tested on
TongFang L860-T2 Loongson-3A5000 laptop.
Cc: stable@vger.kernel.org
Fixes: 6246ed09111f ("LoongArch: Add ACPI-based generic laptop driver")
Signed-off-by: Yao Zi <ziyao@disroot.org>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:
drm-scheduler:
- signal scheduled fence when killing job
dummycon:
- trigger deferred takeover when switching consoles
ivpu:
- improve logging
- update firmware filenames
- reorder steps in command-queue unregistering
Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20250528153550.GA21050@linux.fritz.box
|
|
Replace comma between expressions with semicolons.
Using a ',' in place of a ';' can have unintended side effects.
Although that is not the case here, it is seems best to use ';'
unless ',' is intended.
Found by inspection.
No functional change intended.
Compile tested only.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Fixes: cd3c62282b61 ("drm/nouveau/gsp: add usermode class id to gpu hal")
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://lore.kernel.org/r/20250603061027.1310267-1-nichen@iscas.ac.cn
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-fixes-6.16-2025-06-05:
amdgpu:
- IP discovery fix
- Cleaner shader fix for GC 10.1.x
- OD fix
- UserQ fixes
- Non-OLED panel fix
- Misc display fixes
- Brightness fixes
amdkfd:
- Enable CONFIG_HSA_AMD on RISCV
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250606015932.835829-1-alexander.deucher@amd.com
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
Driver Changes:
- A couple of vm init fixes (Matt Auld)
- Hwmon fixes (Karthik)
- Drop reduntant conversion to bool (Raag)
- Fix CONFIG_INTEL_VSEC dependency (Arnd)
- Rework eviction rejection of bound external bos (Thomas)
- Stop re-submitting signalled jobs (Matt Auld)
- A couple of pxp fixes (Daniele)
- Add back a fix that got lost in a merge (Matt Auld)
- Create LRC bo without VM (Niranjana)
- Fix for the above fix (Maciej)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://lore.kernel.org/r/aEHq44uIAZwfK-mG@fedora
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-fixes for v6.16-rc1:
- Fixes for nt37801 panel
- Fix null deref in HDMI audio helper.
- Fixes for analogix_dp.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://lore.kernel.org/r/14c2eff8-701d-4699-b187-08862715e1ac@linux.intel.com
|
|
There is no disagreement that we should check both ptp->is_virtual_clock
and ptp->n_vclocks to check if the ptp virtual clock is in use.
However, when we acquire ptp->n_vclocks_mux to read ptp->n_vclocks in
ptp_vclock_in_use(), we observe a recursive lock in the call trace
starting from n_vclocks_store().
============================================
WARNING: possible recursive locking detected
6.15.0-rc6 #1 Not tainted
--------------------------------------------
syz.0.1540/13807 is trying to acquire lock:
ffff888035a24868 (&ptp->n_vclocks_mux){+.+.}-{4:4}, at:
ptp_vclock_in_use drivers/ptp/ptp_private.h:103 [inline]
ffff888035a24868 (&ptp->n_vclocks_mux){+.+.}-{4:4}, at:
ptp_clock_unregister+0x21/0x250 drivers/ptp/ptp_clock.c:415
but task is already holding lock:
ffff888030704868 (&ptp->n_vclocks_mux){+.+.}-{4:4}, at:
n_vclocks_store+0xf1/0x6d0 drivers/ptp/ptp_sysfs.c:215
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&ptp->n_vclocks_mux);
lock(&ptp->n_vclocks_mux);
*** DEADLOCK ***
....
============================================
The best way to solve this is to remove the logic that checks
ptp->n_vclocks in ptp_vclock_in_use().
The reason why this is appropriate is that any path that uses
ptp->n_vclocks must unconditionally check if ptp->n_vclocks is greater
than 0 before unregistering vclocks, and all functions are already
written this way. And in the function that uses ptp->n_vclocks, we
already get ptp->n_vclocks_mux before unregistering vclocks.
Therefore, we need to remove the redundant check for ptp->n_vclocks in
ptp_vclock_in_use() to prevent recursive locking.
Fixes: 73f37068d540 ("ptp: support ptp physical/virtual clocks conversion")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://patch.msgid.link/20250520160717.7350-1-aha310510@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When Linux sends out untagged traffic from a port, it will enter the CPU
port without any VLAN tag, even if the port is a member of a vlan
filtering bridge with a PVID egress untagged VLAN.
This makes the CPU port's PVID take effect, and the PVID's VLAN
table entry controls if the packet will be tagged on egress.
Since commit 45e9d59d3950 ("net: dsa: b53: do not allow to configure
VLAN 0") we remove bridged ports from VLAN 0 when joining or leaving a
VLAN aware bridge. But we also clear the untagged bit, causing untagged
traffic from the controller to become tagged with VID 0 (and priority
0).
Fix this by not touching the untagged map of VLAN 0. Additionally,
always keep the CPU port as a member, as the untag map is only effective
as long as there is at least one member, and we would remove it when
bridging all ports and leaving no standalone ports.
Since Linux (and the switch) treats VLAN 0 tagged traffic like untagged,
the actual impact of this is rather low, but this also prevented earlier
detection of the issue.
Fixes: 45e9d59d3950 ("net: dsa: b53: do not allow to configure VLAN 0")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20250602194914.1011890-1-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
- Fix PSR register definitions for ALPM
- Fix u32 overflow in SNPS PHY HDMI PLL setup
- Fix GuC pending message underflow when submit fails
- Fix GuC wakeref underflow race during reset
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://lore.kernel.org/r/aEFW1wGnt1kTVNGF@jlahtine-mobl
|
|
These objects are built as prerequisites of %.stub.o files.
There is no need to use extra-y, which is planned for deprecation.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"Fix three issues introduced into device suspend/resume error paths in
the PM core by some of the recent updates.
First off, replace list_splice() with list_splice_init() in three
places in device suspend error paths to avoid attempting to use an
uninitialized list head going forward.
Second, rearrange device_resume() to avoid leaking the
power.is_suspended device PM flag to the next system suspend/resume
cycle where it can confuse rolling back after an error or early
wakeup.
Finally, add synchronization to dpm_async_resume_children() to avoid
resetting the async state mistakenly for devices whose resume
callbacks have already been queued up for asynchronous execution in
the given device resume phase, which fortunately can happen only if
the preceding system suspend transition has been aborted"
* tag 'pm-6.16-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: sleep: Add locking to dpm_async_resume_children()
PM: sleep: Fix power.is_suspended cleanup for direct-complete devices
PM: sleep: Fix list splicing in device suspend error paths
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from CAN, wireless, Bluetooth, and Netfilter.
Current release - regressions:
- Revert "kunit: configs: Enable CONFIG_INIT_STACK_ALL_PATTERN in
all_tests", makes kunit error out if compiler is old
- wifi: iwlwifi: mvm: fix assert on suspend
- rxrpc: fix return from none_validate_challenge()
Current release - new code bugs:
- ovpn: couple of fixes for socket cleanup and UDP-tunnel teardown
- can: kvaser_pciefd: refine error prone echo_skb_max handling logic
- fix net_devmem_bind_dmabuf() stub when DEVMEM not compiled
- eth: airoha: fixes for config / accel in bridge mode
Previous releases - regressions:
- Bluetooth: hci_qca: move the SoC type check to the right place, fix
GPIO integration
- prevent a NULL deref in rtnl_create_link() after locking changes
- fix udp gso skb_segment after pull from frag_list
- hv_netvsc: fix potential deadlock in netvsc_vf_setxdp()
Previous releases - always broken:
- netfilter:
- nf_nat: also check reverse tuple to obtain clashing entry
- nf_set_pipapo_avx2: fix initial map fill (zeroing)
- fix the helper for incremental update of packet checksums after
modifying the IP address, used by ILA and BPF
- eth:
- stmmac: prevent div by 0 when clock rate is misconfigured
- ice: fix Tx scheduler handling of XDP and changing queue count
- eth: fix support for the RGMII interface when delays configured"
* tag 'net-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits)
calipso: unlock rcu before returning -EAFNOSUPPORT
seg6: Fix validation of nexthop addresses
net: prevent a NULL deref in rtnl_create_link()
net: annotate data-races around cleanup_net_task
selftests: drv-net: tso: make bkg() wait for socat to quit
selftests: drv-net: tso: fix the GRE device name
selftests: drv-net: add configs for the TSO test
wireguard: device: enable threaded NAPI
netlink: specs: rt-link: decode ip6gre
netlink: specs: rt-link: add missing byte-order properties
net: wwan: mhi_wwan_mbim: use correct mux_id for multiplexing
wifi: cfg80211/mac80211: correctly parse S1G beacon optional elements
net: dsa: b53: do not touch DLL_IQQD on bcm53115
net: dsa: b53: allow RGMII for bcm63xx RGMII ports
net: dsa: b53: do not configure bcm63xx's IMP port interface
net: dsa: b53: do not enable RGMII delay on bcm63xx
net: dsa: b53: do not enable EEE on bcm63xx
net: ti: icssg-prueth: Fix swapped TX stats for MII interfaces.
selftests: netfilter: nft_nat.sh: add test for reverse clash with nat
netfilter: nf_nat: also check reverse tuple to obtain clashing entry
...
|
|
Modify the driver to post 3 fewer buffers than the maximum rx buffers
(64) allowed for the firmware. This change mitigates a hardware issue
causing a race condition in the firmware, improving stability and data
handling.
Signed-off-by: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
Signed-off-by: Kiran K <kiran.k@intel.com>
Fixes: c2b636b3f788 ("Bluetooth: btintel_pcie: Add support for PCIe transport")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
This change addresses latency issues observed in HID use cases where
events arrive in bursts. By increasing the Rx descriptor count to 64,
the firmware can handle bursty data more effectively, reducing latency
and preventing buffer overflows.
Signed-off-by: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
Signed-off-by: Kiran K <kiran.k@intel.com>
Fixes: c2b636b3f788 ("Bluetooth: btintel_pcie: Add support for PCIe transport")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
The driver was posting only 6 rx buffers, despite the maximum rx buffers
being defined as 16. Having fewer RX buffers caused firmware exceptions
in HID use cases when events arrived in bursts.
Exception seen on android 6.12 kernel.
E Bluetooth: hci0: Received hw exception interrupt
E Bluetooth: hci0: Received gp1 mailbox interrupt
D Bluetooth: hci0: 00000000: ff 3e 87 80 03 01 01 01 03 01 0c 0d 02 1c 10 0e
D Bluetooth: hci0: 00000010: 01 00 05 14 66 b0 28 b0 c0 b0 28 b0 ac af 28 b0
D Bluetooth: hci0: 00000020: 14 f1 28 b0 00 00 00 00 fa 04 00 00 00 00 40 10
D Bluetooth: hci0: 00000030: 08 00 00 00 7a 7a 7a 7a 47 00 fb a0 10 00 00 00
D Bluetooth: hci0: 00000000: 10 01 0a
E Bluetooth: hci0: ---- Dump of debug registers —
E Bluetooth: hci0: boot stage: 0xe0fb0047
E Bluetooth: hci0: ipc status: 0x00000004
E Bluetooth: hci0: ipc control: 0x00000000
E Bluetooth: hci0: ipc sleep control: 0x00000000
E Bluetooth: hci0: mbox_1: 0x00badbad
E Bluetooth: hci0: mbox_2: 0x0000101c
E Bluetooth: hci0: mbox_3: 0x00000008
E Bluetooth: hci0: mbox_4: 0x7a7a7a7a
Signed-off-by: Chandrashekar Devegowda <chandrashekar.devegowda@intel.com>
Signed-off-by: Kiran K <kiran.k@intel.com>
Fixes: c2b636b3f788 ("Bluetooth: btintel_pcie: Add support for PCIe transport")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
There is unmatched xe_vm_unlock() in the __xe_exec_queue_init().
Leftover from commit fbeaad071a98 ("drm/xe: Create LRC BO without VM")
Fixes: 2b0a0ce0c20b ("drm/xe: Create LRC BO without VM")
Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Link: https://lore.kernel.org/r/20250530135627.2821612-1-maciej.patelczyk@intel.com
(cherry picked from commit 28b996ce73982a44fa86736ca0e3684cb1ae8b24)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
Specifying VM during lrc->bo creation requires VM's reference
to be held for the lifetime of lrc->bo as it will use VM's dma
reservation object. Using VM's dma reservation object for
lrc->bo doesn't provide any advantage. Hence do not pass VM
while creating lrc->bo.
v2: Use xe_bo_unpin_map_no_vm (Matthew Brost)
Fixes: 264eecdba211 ("drm/xe: Decouple xe_exec_queue and xe_lrc")
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250529052031.2429120-2-niranjana.vishwanathapura@intel.com
(cherry picked from commit fbeaad071a98fef87deccee81d564de1c8e8e16d)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
Daniele noticed that the fix in commit 2d2be279f1ca ("drm/xe: fix UAF
around queue destruction") looks to have been unintentionally removed as
part of handling a conflict in some past merge commit. Add it back.
Fixes: ac44ff7cec33 ("Merge tag 'drm-xe-fixes-2024-10-10' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes")
Reported-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v6.12+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250603174213.1543579-2-matthew.auld@intel.com
(cherry picked from commit 9d9fca62dc49d96f97045b6d8e7402a95f8cf92a)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
The expected flow of operations when using PXP is to query the PXP
status and wait for it to transition to "ready" before attempting to
create an exec_queue. This flow is followed by the Mesa driver, but
there is no guarantee that an incorrectly coded (or malicious) app
will not attempt to create the queue first without querying the status.
Therefore, we need to clarify what the expected behavior of the queue
creation ioctl is in this scenario.
Currently, the ioctl always fails with an -EBUSY code no matter the
error, but for consistency it is better to distinguish between "failed
to init" (-EIO) and "not ready" (-EBUSY), the same way the query ioctl
does. Note that, while this is a change in the return code of an ioctl,
the behavior of the ioctl in this particular corner case was not clearly
spec'd, so no one should have been relying on it (and we know that Mesa,
which is the only known userspace for this, didn't).
v2: Minor rework of the doc (Rodrigo)
Fixes: 72d479601d67 ("drm/xe/pxp/uapi: Add userspace and LRC support for PXP-using queues")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250522225401.3953243-7-daniele.ceraolospurio@intel.com
(cherry picked from commit 21784ca96025b62d95b670b7639ad70ddafa69b8)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
The define of the extension type was accidentally used instead of the
one of the property itself. They're both zero, so no functional issue,
but we should use the correct define for code correctness.
Fixes: 41a97c4a1294 ("drm/xe/pxp/uapi: Add API to mark a BO as using PXP")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://lore.kernel.org/r/20250522225401.3953243-6-daniele.ceraolospurio@intel.com
(cherry picked from commit 1d891ee820fd0fbb4101eacb0d922b5050a24933)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
Customer is reporting a really subtle issue where we get random DMAR
faults, hangs and other nasties for kernel migration jobs when stressing
stuff like s2idle/s3/s4. The explosions seems to happen somewhere
after resuming the system with splats looking something like:
PM: suspend exit
rfkill: input handler disabled
xe 0000:00:02.0: [drm] GT0: Engine reset: engine_class=bcs, logical_mask: 0x2, guc_id=0
xe 0000:00:02.0: [drm] GT0: Timedout job: seqno=24496, lrc_seqno=24496, guc_id=0, flags=0x13 in no process [-1]
xe 0000:00:02.0: [drm] GT0: Kernel-submitted job timed out
The likely cause appears to be a race between suspend cancelling the
worker that processes the free_job()'s, such that we still have pending
jobs to be freed after the cancel. Following from this, on resume the
pending_list will now contain at least one already complete job, but it
looks like we call drm_sched_resubmit_jobs(), which will then call
run_job() on everything still on the pending_list. But if the job was
already complete, then all the resources tied to the job, like the bb
itself, any memory that is being accessed, the iommu mappings etc. might
be long gone since those are usually tied to the fence signalling.
This scenario can be seen in ftrace when running a slightly modified
xe_pm IGT (kernel was only modified to inject artificial latency into
free_job to make the race easier to hit):
xe_sched_job_run: dev=0000:00:02.0, fence=0xffff888276cc8540, seqno=0, lrc_seqno=0, gt=0, guc_id=0, batch_addr=0x000000146910 ...
xe_exec_queue_stop: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x0, flags=0x13
xe_exec_queue_stop: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=1, guc_state=0x0, flags=0x4
xe_exec_queue_stop: dev=0000:00:02.0, 4:0x1, gt=1, width=1, guc_id=0, guc_state=0x0, flags=0x3
xe_exec_queue_stop: dev=0000:00:02.0, 1:0x1, gt=1, width=1, guc_id=1, guc_state=0x0, flags=0x3
xe_exec_queue_stop: dev=0000:00:02.0, 4:0x1, gt=1, width=1, guc_id=2, guc_state=0x0, flags=0x3
xe_exec_queue_resubmit: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x0, flags=0x13
xe_sched_job_run: dev=0000:00:02.0, fence=0xffff888276cc8540, seqno=0, lrc_seqno=0, gt=0, guc_id=0, batch_addr=0x000000146910 ...
.....
xe_exec_queue_memory_cat_error: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x3, flags=0x13
So the job_run() is clearly triggered twice for the same job, even
though the first must have already signalled to completion during
suspend. We can also see a CAT error after the re-submit.
To prevent this only resubmit jobs on the pending_list that have not yet
signalled.
v2:
- Make sure to re-arm the fence callbacks with sched_start().
v3 (Matt B):
- Stop using drm_sched_resubmit_jobs(), which appears to be deprecated
and just open-code a simple loop such that we skip calling run_job()
on anything already signalled.
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4856
Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: William Tseng <william.tseng@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://lore.kernel.org/r/20250528113328.289392-2-matthew.auld@intel.com
(cherry picked from commit 38fafa9f392f3110d2de431432d43f4eef99cd1b)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
For preempt_fence mode VM's we're rejecting eviction of
shared bos during VM_BIND. However, since we do this in the
move() callback, we're getting an eviction failure warning from
TTM. The TTM callback intended for these things is
eviction_valuable().
However, the latter doesn't pass in the struct ttm_operation_ctx
needed to determine whether the caller needs this.
Instead, attach the needed information to the vm under the
vm->resv, until we've been able to update TTM to provide the
needed information. And add sufficient lockdep checks to prevent
misuse and races.
v2:
- Fix a copy-paste error in xe_vm_clear_validating()
v3:
- Fix kerneldoc errors.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 0af944f0e308 ("drm/xe: Reject BO eviction if BO is bound to current VM")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250528164105.234718-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 9d5558649f68e2e84a87a909631b30e15ca0f8ec)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
The XE driver can be built with or without VSEC support, but fails to link as
built-in if vsec is in a loadable module:
x86_64-linux-ld: vmlinux.o: in function `xe_vsec_init':
(.text+0x1e83e16): undefined reference to `intel_vsec_register'
The normal fix for this is to add a 'depends on INTEL_VSEC || !INTEL_VSEC',
forcing XE to be a loadable module as well, but that causes a circular
dependency:
symbol DRM_XE depends on INTEL_VSEC
symbol INTEL_VSEC depends on X86_PLATFORM_DEVICES
symbol X86_PLATFORM_DEVICES is selected by DRM_XE
The problem here is selecting a symbol from another subsystem, so change
that as well and rephrase the 'select' into the corresponding dependency.
Since X86_PLATFORM_DEVICES is 'default y', there is no change to
defconfig builds here.
Fixes: 0c45e76fcc62 ("drm/xe/vsec: Support BMG devices")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250529172355.2395634-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit e4931f8be347ec5f19df4d6d33aea37145378c42)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
The result of integer comparison already evaluates to bool. No need for
explicit conversion.
No functional impact.
Fixes: 0e414bf7ad01 ("drm/xe: Expose PCIe link downgrade attributes")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202505292205.MoljmkjQ-lkp@intel.com/
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20250529160937.490147-1-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 61761a6b57f2818983466d24aab60baab471ba21)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
Move power2/curr2_crit to channel 1 i.e power1/curr1_crit as this
represents the entire card critical power/current.
v2: Update the date of curr1_crit also in hwmon documentation.
Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
Fixes: 345dadc4f68b ("drm/xe/hwmon: Add infra to support card power and energy attributes")
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://lore.kernel.org/r/20250529163458.2354509-3-karthik.poosa@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 25e963a09e059ffdb15c09cc79cfded855b43668)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|