Age | Commit message (Collapse) | Author |
|
DDI might be HDMI or DP only, leaving the other encoder
uninitialized. Calling the shutdown hook on an uninitialized encoder may
lead to a NULL pointer dereference. Check the encoder types (and thus
validity via the DP output_reg or HDMI hdmi_reg checks) before calling
the hooks.
Reported-and-tested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Closes: https://lore.kernel.org/r/20241031105145.2140590-1-senozhatsky@chromium.org
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8b197c50e7f3be2bbc07e3935b21e919815015d5.1735568047.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Similar to intel_encoder_is_dp() and friends.
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Tested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/e6bf9e01deb5d0d8b566af128a762d1313638847.1735568047.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Errors from intel_ddi_init_hdmi_connector() can just mean "there's no
HDMI" while we'll still want to continue with DP only. Handle the errors
gracefully, but don't propagate. Clear the hdmi_reg which is used as a
proxy to indicate the HDMI is initialized.
v2: Gracefully handle but do not propagate
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Reported-and-tested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Closes: https://lore.kernel.org/r/20241031105145.2140590-1-senozhatsky@chromium.org
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> # v1
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/d72cb54ac7cc5ca29b3b9d70e4d368ea41643b08.1735568047.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Handle encoder and connector init failures in g4x_hdmi_init(). This is
similar to g4x_dp_init().
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Reported-and-tested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Closes: https://lore.kernel.org/r/20241031105145.2140590-1-senozhatsky@chromium.org
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/cafae7bf1f9ffb8f6a1d7a508cd2ce7dcf06fef7.1735568047.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Propagate errors from intel_hdmi_init_connector() to be able to handle
them at callers. This is similar to intel_dp_init_connector().
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Reported-and-tested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Closes: https://lore.kernel.org/r/20241031105145.2140590-1-senozhatsky@chromium.org
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/cdaf9e32cc4880c46e120933438c37b4d87be12e.1735568047.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
The caller doesn't actually need the returned struct intel_connector;
it's stored in the ->attached_connector of intel_dp and
intel_hdmi. Switch to returning an int with 0 for success and negative
errors codes to be able to indicate success even when we don't have a
connector.
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Tested-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/8ef7fe838231919e85eaead640c51ad3e4550d27.1735568047.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
ipc_mmio_init() used the post-decrement operator in its loop continuing
condition of "retries" counter being "> 0", which meant that when this
condition caused loop exit "retries" counter reached -1.
But the later valid exec stage failure check only tests for "retries"
counter being exactly zero, so it didn't trigger in this case (but
would wrongly trigger if the code reaches a valid exec stage in the
very last loop iteration).
Fix this by using the pre-decrement operator instead, so the loop counter
is exactly zero on valid exec stage failure.
Fixes: dc0514f5d828 ("net: iosm: mmio scratchpad")
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Link: https://patch.msgid.link/8b19125a825f9dcdd81c667c1e5c48ba28d505a6.1735490770.git.mail@maciej.szmigiero.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In efx_tc_ct_zone_ht_params, the key_len was previously set to
offsetof(struct efx_tc_ct_zone, linkage). This calculation is incorrect
because it includes any padding between the zone field and the linkage
field due to structure alignment, which can vary between systems.
This patch updates key_len to use sizeof_field(struct efx_tc_ct_zone, zone)
, ensuring that the hash table correctly uses the zone as the key. This fix
prevents potential hash lookup errors and improves connection tracking
reliability.
Fixes: c3bb5c6acd4e ("sfc: functions to register for conntrack zone offload")
Signed-off-by: Liang Jie <liangjie@lixiang.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20241230093709.3226854-1-buaajxlj@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- A couple of OA fixes squashed for stable backporting (Umesh)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z3bur0RmH6-70YSh@fedora
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
drm-misc-fixes for v6.13-rc6:
- Only fixes for adv7511 driver, including a use-after-free.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/f58429b7-5f11-4b78-b577-de32b41299ea@linux.intel.com
|
|
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
- Fix C10 pll programming sequence [cx0_phy] (Suraj Kandpal)
- Fix power gate sequence. [dg1] (Rodrigo Vivi)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z2wKf7tmElKFdnoP@linux
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
UAPI Changes:
- Revert some devcoredump file format changes
breaking a mesa debug tool (John)
Driver Changes:
- Fixes around waits when moving to system (Nirmoy)
- Fix a typo when checking for LMEM provisioning (Michal)
- Fix a fault on fd close after unbind (Lucas)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z2mjt7OTfH76cgua@fedora
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm
Pull pmdomain fixes from Ulf Hansson:
- Silence warning by adding a dummy release function
- imx: Fix an OF node reference leak in imx_gpcv2_probe()
* tag 'pmdomain-v6.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
pmdomain: core: add dummy release function to genpd device
pmdomain: imx: gpcv2: fix an OF node reference leak in imx_gpcv2_probe()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fix from Ulf Hansson:
- sdhci-msm: Fix crypto key eviction
* tag 'mmc-v6.13-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: sdhci-msm: fix crypto key eviction
|
|
This is a set of squashed commits to facilitate smooth applying to
stable. Each commit message is retained for reference.
1) Allow a GGTT mapped batch to be submitted to user exec queue
For a OA use case, one of the HW registers needs to be modified by
submitting an MI_LOAD_REGISTER_IMM command to the users exec queue, so
that the register is modified in the user's hardware context. In order
to do this a batch that is mapped in GGTT, needs to be submitted to the
user exec queue. Since all user submissions use q->vm and hence PPGTT,
add some plumbing to enable submission of batches mapped in GGTT.
v2: ggtt is zero-initialized, so no need to set it false (Matt Brost)
2) xe/oa: Use MI_LOAD_REGISTER_IMMEDIATE to enable OAR/OAC
To enable OAR/OAC, a bit in RING_CONTEXT_CONTROL needs to be set.
Setting this bit cause the context image size to change and if not done
correct, can cause undesired hangs.
Current code uses a separate exec_queue to modify this bit and is
error-prone. As per HW recommendation, submit MI_LOAD_REGISTER_IMM to
the target hardware context to modify the relevant bit.
In v2 version, an attempt to submit everything to the user-queue was
made, but it failed the unprivileged-single-ctx-counters test. It
appears that the OACTXCONTROL must be modified from a remote context.
In v3 version, all context specific register configurations were moved
to use LOAD_REGISTER_IMMEDIATE and that seems to work well. This is a
cleaner way, since we can now submit all configuration to user
exec_queue and the fence handling is simplified.
v2:
(Matt)
- set job->ggtt to true if create job is successful
- unlock vm on job error
(Ashutosh)
- don't wait on job submission
- use kernel exec queue where possible
v3:
(Ashutosh)
- Fix checkpatch issues
- Remove extra spaces/new-lines
- Add Fixes: and Cc: tags
- Reset context control bit when OA stream is closed
- Submit all config via MI_LOAD_REGISTER_IMMEDIATE
(Umesh)
- Update commit message for v3 experiment
- Squash patches for easier port to stable
v4:
(Ashutosh)
- No need to pass q to xe_oa_submit_bb
- Do not support exec queues with width > 1
- Fix disabling of CTX_CTRL_OAC_CONTEXT_ENABLE
v5:
(Ashutosh)
- Drop reg_lri related comments
- Use XE_OA_SUBMIT_NO_DEPS in xe_oa_load_with_lri
Fixes: 8135f1c09dd2 ("drm/xe/oa: Don't reset OAC_CONTEXT_ENABLE on OA stream close")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com> # commit 1
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241220171919.571528-2-umesh.nerlige.ramappa@intel.com
(cherry picked from commit 55039832f98c7e05f1cf9e0d8c12b2490abd0f16)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
|
|
SM8550/SM8650 SoCs
SM8550 and SM8650 SoCs doesn't support UFS PHY retention. So once these SoCs
reaches the low power state (CX power collapse) during system suspend, all
the PHY hardware state gets lost. This leads to the UFS resume failure:
ufshcd-qcom 1d84000.ufs: ufshcd_uic_hibern8_exit: hibern8 exit failed. ret = 5
ufshcd-qcom 1d84000.ufs: __ufshcd_wl_resume: hibern8 exit failed 5
ufs_device_wlun 0:0:0:49488: ufshcd_wl_resume failed: 5
ufs_device_wlun 0:0:0:49488: PM: dpm_run_callback(): scsi_bus_resume+0x0/0x84 returns 5
ufs_device_wlun 0:0:0:49488: PM: failed to resume async: error 5
With the default system suspend level of UFS_PM_LVL_3, the power domain for
UFS PHY needs to be kept always ON to retain the state. But this would
prevent these SoCs from reaching the CX power collapse state, leading to
poor power saving during system suspend.
So to fix this issue without affecting the power saving, set
'ufs_qcom_drvdata::no_phy_retention' to true which sets 'hba->spm_lvl' to
UFS_PM_LVL_5 to allow both the controller and device (in turn the PHY) to be
powered down during system suspend for these SoCs by default.
Cc: stable@vger.kernel.org # 6.3
Fixes: 35cf1aaab169 ("arm64: dts: qcom: sm8550: Add UFS host controller and phy nodes")
Fixes: 10e024671295 ("arm64: dts: qcom: sm8650: add interconnect dependent device nodes")
Reported-by: Neil Armstrong <neil.armstrong@linaro.org>
Tested-by: Amit Pundir <amit.pundir@linaro.org> # on SM8550-HDK
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-4-63c4b95a70b9@linaro.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In order to allow platform specific flags and configurations, introduce the
platform specific OF data and move the existing quirk
UFSHCD_QUIRK_BROKEN_LSDBS_CAP for SM8550 and SM8650 SoCs.
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Tested-by: Amit Pundir <amit.pundir@linaro.org> # on SM8550-HDK
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-3-63c4b95a70b9@linaro.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
drivers
Otherwise, the default levels will override the levels set by the host
controller drivers.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-2-63c4b95a70b9@linaro.org
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
ufs_qcom_power_up_sequence()
PHY might already be powered on during ufs_qcom_power_up_sequence() in a
couple of cases:
1. During UFSHCD_QUIRK_REINIT_AFTER_MAX_GEAR_SWITCH quirk
2. Resuming from spm_lvl = 5 suspend
In those cases, it is necessary to call phy_power_off() and phy_exit() in
ufs_qcom_power_up_sequence() function to power off the PHY before calling
phy_init() and phy_power_on().
Case (1) is doing it via ufs_qcom_reinit_notify() callback, but case (2) is
not handled. So to satisfy both cases, call phy_power_off() and phy_exit()
if the phy_count is non-zero. And with this change, the reinit_notify()
callback is no longer needed.
This fixes the below UFS resume failure with spm_lvl = 5:
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: Enabling the controller failed
ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5
ufs_device_wlun 0:0:0:49488: ufshcd_wl_resume failed: -5
ufs_device_wlun 0:0:0:49488: PM: dpm_run_callback(): scsi_bus_resume returns -5
ufs_device_wlun 0:0:0:49488: PM: failed to resume async: error -5
Cc: stable@vger.kernel.org # 6.3
Fixes: baf5ddac90dc ("scsi: ufs: ufs-qcom: Add support for reinitializing the UFS device")
Reported-by: Ram Kumar Dwivedi <quic_rdwivedi@quicinc.com>
Tested-by: Amit Pundir <amit.pundir@linaro.org> # on SM8550-HDK
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20241219-ufs-qcom-suspend-fix-v3-1-63c4b95a70b9@linaro.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-next
Georgi writes:
interconnect fixes for v6.13-rc
This contains two fixes. One fixing a boot error on db410c board when UBSAN
is enabled with clang-19 builds. The other one adds a missing return value
check after devm_kasprintf.
- interconnect: qcom: icc-rpm: Set the count member before accessing the flex array
- interconnect: icc-clk: check return values of devm_kasprintf()
Signed-off-by: Georgi Djakov <djakov@kernel.org>
* tag 'icc-6.13-rc6' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/djakov/icc:
interconnect: icc-clk: check return values of devm_kasprintf()
interconnect: qcom: icc-rpm: Set the count member before accessing the flex array
|
|
The DP calibration data is stored in nvmem cells, and the data layout is
described in the `mtk_dp_efuse_fmt` arrays for each platform.
There is no guarantee that the data is always a 4-length u32 cell array.
For example, MT8188 has a data length of 3, preventing it from passing
the preliminary check and undergoing calibration.
Update the logic to support flexible data lengths. Specifically, we
validate the length returned from `nvmem_cell_read()` against the
platform-specific efuse format. If out-of-bound access is detected, fall
back to the default calibration values. This likely indicates an error
in either the efuse data length described in DT or the efuse format
within the driver.
Signed-off-by: Fei Shao <fshao@chromium.org>
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20241204142626.158395-1-fshao@chromium.org/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
|
|
cocci warnings: (new ones prefixed by >>)
>> drivers/gpu/drm/mediatek/mtk_drm_drv.c:1092:2-3: Unneeded semicolon
Fixes: 4c932840db1d ("drm/mediatek: Implement OF graphs support for display paths")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202412022048.kY2ZhxZ4-lkp@intel.com/
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20241230135314.5419-1-chunkuang.hu@kernel.org/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
|
|
This reverts commit 1460bb1fef9ccf7390af0d74a15252442fd6effd.
In all places the MI_STORE_DATA_IMM are not followed by a read of
the same memory address in the same batch buffer and the posted writes
are flushed with PIPE_CONTROL or MI_FLUSH_DW in xe_ring_ops.c functions
so there is no need to set this register.
Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Fixes: 1460bb1fef9c ("drm/xe: Force write completion of MI_STORE_DATA_IMM")
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241227183230.101334-1-jose.souza@intel.com
|
|
Pull NVMe fixes from Keith:
"nvme fixes for Linux 6.13
- Fix device specific quirk for PRP list alignment (Robert)
- Fix target name overflow (Leo)
- Fix target write granularity (Luis)
- Fix target sleeping in atomic context (Nilay)
- Remove unnecessary tcp queue teardown (Chunguang)"
* tag 'nvme-6.13-2024-12-31' of git://git.infradead.org/nvme:
nvme-tcp: remove nvme_tcp_destroy_io_queues()
nvmet-loop: avoid using mutex in IO hotpath
nvmet: propagate npwg topology
nvmet: Don't overflow subsysnqn
nvme-pci: 512 byte aligned dma pool segment quirk
|
|
Fixed to return ENXIO from __send_message_basic_sanity()
to indicate that device is in error state. In the case of
ERR_DEVICE_DETACHED state, the driver should not post the
commands to the firmware as it will time out eventually.
Removed bnxt_re_modify_qp() call from bnxt_re_dev_stop()
as it is a no-op.
Fixes: cc5b9b48d447 ("RDMA/bnxt_re: Recover the device when FW error is detected")
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Link: https://patch.msgid.link/20241231025008.2267162-1-kalesh-anakkur.purayil@broadcom.com
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Move the declaration of the 'ib_sge list' variable outside the
'always_invalidate' block to ensure it remains accessible for use
throughout the function.
Previously, 'ib_sge list' was declared within the 'always_invalidate'
block, limiting its accessibility, then caused a
'BUG: kernel NULL pointer dereference'[1].
? __die_body.cold+0x19/0x27
? page_fault_oops+0x15a/0x2d0
? search_module_extables+0x19/0x60
? search_bpf_extables+0x5f/0x80
? exc_page_fault+0x7e/0x180
? asm_exc_page_fault+0x26/0x30
? memcpy_orig+0xd5/0x140
rxe_mr_copy+0x1c3/0x200 [rdma_rxe]
? rxe_pool_get_index+0x4b/0x80 [rdma_rxe]
copy_data+0xa5/0x230 [rdma_rxe]
rxe_requester+0xd9b/0xf70 [rdma_rxe]
? finish_task_switch.isra.0+0x99/0x2e0
rxe_sender+0x13/0x40 [rdma_rxe]
do_task+0x68/0x1e0 [rdma_rxe]
process_one_work+0x177/0x330
worker_thread+0x252/0x390
? __pfx_worker_thread+0x10/0x10
This change ensures the variable is available for subsequent operations
that require it.
[1] https://lore.kernel.org/linux-rdma/6a1f3e8f-deb0-49f9-bc69-a9b03ecfcda7@fujitsu.com/
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Link: https://patch.msgid.link/20241231013416.1290920-1-lizhijian@fujitsu.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
When driver processes the internal state change command, it use an
asynchronous thread to process the command operation. If the main
thread detects that the task has timed out, the asynchronous thread
will panic when executing the completion notification because the
main thread completion object has been released.
BUG: unable to handle page fault for address: fffffffffffffff8
PGD 1f283a067 P4D 1f283a067 PUD 1f283c067 PMD 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:complete_all+0x3e/0xa0
[...]
Call Trace:
<TASK>
? __die_body+0x68/0xb0
? page_fault_oops+0x379/0x3e0
? exc_page_fault+0x69/0xa0
? asm_exc_page_fault+0x22/0x30
? complete_all+0x3e/0xa0
fsm_main_thread+0xa3/0x9c0 [mtk_t7xx (HASH:1400 5)]
? __pfx_autoremove_wake_function+0x10/0x10
kthread+0xd8/0x110
? __pfx_fsm_main_thread+0x10/0x10 [mtk_t7xx (HASH:1400 5)]
? __pfx_kthread+0x10/0x10
ret_from_fork+0x38/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK>
[...]
CR2: fffffffffffffff8
---[ end trace 0000000000000000 ]---
Use the reference counter to ensure safe release as Sergey suggests:
https://lore.kernel.org/all/da90f64c-260a-4329-87bf-1f9ff20a5951@gmail.com/
Fixes: 13e920d93e37 ("net: wwan: t7xx: Add core components")
Signed-off-by: Jinjian Song <jinjian.song@fibocom.com>
Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Link: https://patch.msgid.link/20241224041552.8711-1-jinjian.song@fibocom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A Marvell 88E8075 ethernet controller has this device ID instead of
11ab:4370 and works fine with the sky2 driver.
Signed-off-by: Pascal Hambourg <pascal@plouf.fr.eu.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/10165a62-99fb-4be6-8c64-84afd6234085@plouf.fr.eu.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Current implementation of mv643xx_eth_shared_of_add_port() calls
of_parse_phandle(), but does not release the refcount on error. Call
of_node_put() in the error path and in mv643xx_eth_shared_of_remove().
This bug was found by an experimental verification tool that I am
developing.
Fixes: 76723bca2802 ("net: mv643xx_eth: add DT parsing support")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Link: https://patch.msgid.link/20241221081448.3313163-1-joe@pf.is.s.u-tokyo.ac.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit ba0925c34e0f ("gve: process XSK TX descriptors as part of RX NAPI")
moved XSK TX processing to be part of the RX NAPI. However, that commit
did not include triggering the RX NAPI in gve_xsk_wakeup. This is
necessary because the TX NAPI only processes TX completions, meaning
that a TX wakeup would not actually trigger XSK descriptor processing.
Also, the branch on XDP_WAKEUP_TX was supposed to have been removed, as
the NAPI should be scheduled whether the wakeup is for RX or TX.
Fixes: ba0925c34e0f ("gve: process XSK TX descriptors as part of RX NAPI")
Cc: stable@vger.kernel.org
Signed-off-by: Joshua Washington <joshwash@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Link: https://patch.msgid.link/20241221032807.302244-1-pkaligineedi@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Check the return value of clk_prepare_enable to ensure that priv->clk has
been successfully enabled.
If priv->clk was not enabled during bcm_sysport_probe, bcm_sysport_resume,
or bcm_sysport_open, it must not be disabled in any subsequent execution
paths.
Fixes: 31bc72d97656 ("net: systemport: fetch and use clock resources")
Signed-off-by: Vitalii Mordan <mordan@ispras.ru>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20241227123007.2333397-1-mordan@ispras.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix typo in cd_dbg line to add trailing newline character.
Signed-off-by: Steven Davis <goldside000@outlook.com>
Link: https://lore.kernel.org/lkml/20241229165744.21725-1-goldside000@outlook.com
Reviewed-by: Phillip Potter <phil@philpotter.co.uk>
Link: https://lore.kernel.org/lkml/Z3GV2W_MUOw5BrtR@equinox
Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Link: https://lore.kernel.org/r/20241230193431.441120-2-phil@philpotter.co.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform drivers fixes from Ilpo Järvinen:
"hp-wmi:
- mark 8A15 board for timed OMEN thermal profile
mlx-platform:
- call pci_dev_put() to balance the refcount
thinkpad-acpi:
- Add support for hotkey 0x1401"
* tag 'platform-drivers-x86-v6.13-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: thinkpad-acpi: Add support for hotkey 0x1401
platform/x86: hp-wmi: mark 8A15 board for timed OMEN thermal profile
platform/x86: mlx-platform: call pci_dev_put() to balance the refcount
|
|
Registers DSI_VM_CMD and DSI_SHADOW_DEBUG start at different
addresses in both MT8186 and MT8188 compared to the older IPs.
Add two members in struct mtk_dsi_driver_data to specify the
offsets for these two registers on a per-SoC basis, then do
specify those in all of the currently present SoC driver data.
This fixes writes to the Video Mode Command Packet Control
register, fixing enablement of command packet transmission
(VM_CMD_EN) and allowance of this transmission during the
VFP period (TS_VFP_EN) on both MT8186 and MT8188.
Fixes: 03d7adc41027 ("drm/mediatek: Add mt8186 dsi compatible to mtk_dsi.c")
Fixes: 814d5341f314 ("drm/mediatek: Add mt8188 dsi compatible to mtk_dsi.c")
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20241219112733.47907-1-angelogioacchino.delregno@collabora.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
|
|
The commit 3ab1663af6c1 ("watchdog: stm32_iwdg: Add pretimeout support")
introduces the support for the pre-timeout interrupt.
The support for this interrupt is optional but the driver uses the
platform_get_irq() which produces an error message during the driver
probe if we don't have any `interrupts` property in the DT.
Use the platform_get_irq_optional() API to get rid of the error message
as this property is optional.
Fixes: 3ab1663af6c1 ("watchdog: stm32_iwdg: Add pretimeout support")
Signed-off-by: Clément Le Goffic <clement.legoffic@foss.st.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20241218092227.771133-1-clement.legoffic@foss.st.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
|
|
Fix dp mode valid issue to avoid abnormal display of limit state.
After DP passes link training, it can express the lane count of the
current link status is good. Calculate the maximum bandwidth supported
by DP using the current lane count.
The color format will select the best one based on the bandwidth
requirements of the current timing mode. If the current timing mode
uses RGB and meets the DP link bandwidth requirements, RGB will be used.
If the timing mode uses RGB but does not meet the DP link bandwidthi
requirements, it will continue to check whether YUV422 meets
the DP link bandwidth.
FEC overhead is approximately 2.4% from DP 1.4a spec 2.2.1.4.2.
The down-spread amplitude shall either be disabled (0.0%) or up
to 0.5% from 1.4a 3.5.2.6. Add up to approximately 3% total overhead.
Because rate is already divided by 10,
mode->clock does not need to be multiplied by 10.
Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver")
Signed-off-by: Liankun Yang <liankun.yang@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20241025083036.8829-3-liankun.yang@mediatek.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
|
|
Setting up misc0 for Pixel Encoding Format.
According to the definition of YCbCr in spec 1.2a Table 2-96,
0x1 << 1 should be written to the register.
Use switch case to distinguish RGB, YCbCr422,
and unsupported color formats.
Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver")
Signed-off-by: Liankun Yang <liankun.yang@mediatek.com>
Link: https://patchwork.kernel.org/project/dri-devel/patch/20241025083036.8829-2-liankun.yang@mediatek.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
|
|
This reverts commit fd620fc25d88a1e490eaa9f72bc31962be1b4741.
Boot failures reported by
KernelCI:
[ 4.395400] mediatek-drm mediatek-drm.5.auto: bound 1c014000.merge (ops 0xffffd35fd12975f8)
[ 4.396155] mediatek-drm mediatek-drm.5.auto: bound 1c000000.ovl (ops 0xffffd35fd12977b8)
[ 4.411951] mediatek-drm mediatek-drm.5.auto: bound 1c002000.rdma (ops 0xffffd35fd12989c0)
[ 4.536837] mediatek-drm mediatek-drm.5.auto: bound 1c004000.ccorr (ops 0xffffd35fd1296cf0)
[ 4.545181] mediatek-drm mediatek-drm.5.auto: bound 1c005000.aal (ops 0xffffd35fd1296a80)
[ 4.553344] mediatek-drm mediatek-drm.5.auto: bound 1c006000.gamma (ops 0xffffd35fd12972b0)
[ 4.561680] mediatek-drm mediatek-drm.5.auto: bound 1c014000.merge (ops 0xffffd35fd12975f8)
[ 4.570025] ------------[ cut here ]------------
[ 4.574630] refcount_t: underflow; use-after-free.
[ 4.579416] WARNING: CPU: 6 PID: 81 at lib/refcount.c:28 refcount_warn_saturate+0xf4/0x148
[ 4.587670] Modules linked in:
[ 4.590714] CPU: 6 UID: 0 PID: 81 Comm: kworker/u32:3 Tainted: G W 6.12.0 #1 cab58e2e59020ebd4be8ada89a65f465a316c742
[ 4.602695] Tainted: [W]=WARN
[ 4.605649] Hardware name: Acer Tomato (rev2) board (DT)
[ 4.610947] Workqueue: events_unbound deferred_probe_work_func
[ 4.616768] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 4.623715] pc : refcount_warn_saturate+0xf4/0x148
[ 4.628493] lr : refcount_warn_saturate+0xf4/0x148
[ 4.633270] sp : ffff8000807639c0
[ 4.636571] x29: ffff8000807639c0 x28: ffff34ff4116c640 x27: ffff34ff4368e080
[ 4.643693] x26: ffffd35fd1299ac8 x25: ffff34ff46c8c410 x24: 0000000000000000
[ 4.650814] x23: ffff34ff4368e080 x22: 00000000fffffdfb x21: 0000000000000002
[ 4.657934] x20: ffff34ff470c6000 x19: ffff34ff410c7c10 x18: 0000000000000006
[ 4.665055] x17: 666678302073706f x16: 2820656772656d2e x15: ffff800080763440
[ 4.672176] x14: 0000000000000000 x13: 2e656572662d7265 x12: ffffd35fd2ed14f0
[ 4.679297] x11: 0000000000000001 x10: 0000000000000001 x9 : ffffd35fd0342150
[ 4.686418] x8 : c0000000ffffdfff x7 : ffffd35fd2e21450 x6 : 00000000000affa8
[ 4.693539] x5 : ffffd35fd2ed1498 x4 : 0000000000000000 x3 : 0000000000000000
[ 4.700660] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff34ff40932580
[ 4.707781] Call trace:
[ 4.710216] refcount_warn_saturate+0xf4/0x148 (P)
[ 4.714993] refcount_warn_saturate+0xf4/0x148 (L)
[ 4.719772] kobject_put+0x110/0x118
[ 4.723335] put_device+0x1c/0x38
[ 4.726638] mtk_drm_bind+0x294/0x5c0
[ 4.730289] try_to_bring_up_aggregate_device+0x16c/0x1e0
[ 4.735673] __component_add+0xbc/0x1c0
[ 4.739495] component_add+0x1c/0x30
[ 4.743058] mtk_disp_rdma_probe+0x140/0x210
[ 4.747314] platform_probe+0x70/0xd0
[ 4.750964] really_probe+0xc4/0x2a8
[ 4.754527] __driver_probe_device+0x80/0x140
[ 4.758870] driver_probe_device+0x44/0x120
[ 4.763040] __device_attach_driver+0xc0/0x108
[ 4.767470] bus_for_each_drv+0x8c/0xf0
[ 4.771294] __device_attach+0xa4/0x198
[ 4.775117] device_initial_probe+0x1c/0x30
[ 4.779286] bus_probe_device+0xb4/0xc0
[ 4.783109] deferred_probe_work_func+0xb0/0x100
[ 4.787714] process_one_work+0x18c/0x420
[ 4.791712] worker_thread+0x30c/0x418
[ 4.795449] kthread+0x128/0x138
[ 4.798665] ret_from_fork+0x10/0x20
[ 4.802229] ---[ end trace 0000000000000000 ]---
Fixes: fd620fc25d88 ("drm/mediatek: Switch to for_each_child_of_node_scoped()")
Cc: stable@vger.kernel.org
Cc: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Reported-by: Sasha Levin <sashal@kernel.org>
Closes: https://lore.kernel.org/lkml/Z0lNHdwQ3rODHQ2c@sashalap/T/#mfaa6343cfd4d59aae5912b095c0693c0553e746c
Link: https://patchwork.kernel.org/project/dri-devel/patch/20241223151218.7958-1-chunkuang.hu@kernel.org/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
|
|
The last use of intel_ring_cacheline_align() was removed in 2017 by
commit afa8ce5b3080 ("drm/i915: Nuke legacy flip queueing code")
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241227113754.25871-3-tursulin@igalia.com
|
|
intel_huc_suspend() was added in 2022 by
commit 27536e03271d ("drm/i915/huc: track delayed HuC load with a
fence")
but hasn't been used.
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241227113754.25871-2-tursulin@igalia.com
|
|
i915_active_acquire_for_context() was added in 2020 by
commit 5d9341370f57 ("drm/i915: Export a preallocate variant of
i915_active_acquire()") but has never been used.
The last use of __i915_gem_object_is_lmem() was removed in 2021 by
commit ff20afc4cee7 ("drm/i915: Update error capture code to avoid using
the current vma state")
Remove them.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241227113754.25871-1-tursulin@igalia.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Fix a hang in the "kernel IBT no ENDBR" self-test that may trigger
on FRED systems, caused by incomplete FRED state cleanup in the
#CP fault handler
- Improve TDX (Coco VM) guest unrecoverable error handling to not
potentially leak decrypted memory
* tag 'x86-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
virt: tdx-guest: Just leak decrypted memory on unrecoverable errors
x86/fred: Clear WFE in missing-ENDBRANCH #CPs
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fix from Ingo Molnar:
"Fix bogus MSI IRQ setup warning on RISC-V"
* tag 'irq-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
PCI/MSI: Handle lack of irqdomain gracefully
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- IMX: fix stop condition in single master mode and add compatible
string for errata adherence
- Microchip: Add support for proper repeated sends and fix unnecessary
NAKs on empty messages, which caused false bus detection
* tag 'i2c-for-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: microchip-core: fix "ghost" detections
i2c: microchip-core: actually use repeated sends
i2c: imx: add imx7d compatible string for applying erratum ERR007805
i2c: imx: fix missing stop condition in single-master mode
|
|
F8 mode key on Lenovo 2025 platforms use a different key code.
Adding support for the new keycode 0x1401.
Tested on X1 Carbon Gen 13 and X1 2-in-1 Gen 10.
Signed-off-by: Vishnu Sankar <vishnuocv@gmail.com>
Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://lore.kernel.org/r/20241227231840.21334-1-vishnuocv@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
The HP OMEN 8 (2022), corresponding to a board ID of 8A15, supports OMEN
thermal profile and requires the timed profile quirk.
Upon adding this ID to both the omen_thermal_profile_boards and
omen_timed_thermal_profile_boards, significant bump in performance can be
observed. For instance, SilverBench (https://silver.urih.com/) results
improved from ~56,000 to ~69,000, as a result of higher power draws (and
thus core frequencies) whilst under load:
Package Power:
- Before the patch: ~65W (dropping to about 55W under sustained load).
- After the patch: ~115W (dropping to about 105W under sustained load).
Core Power:
- Before: ~60W (ditto above).
- After: ~108W (ditto above).
Add 8A15 to omen_thermal_profile_boards and
omen_timed_thermal_profile_boards to improve performance.
Signed-off-by: Xi Xiao <1577912515@qq.com>
Signed-off-by: Mingcong Bai <jeffbai@aosc.io>
Link: https://lore.kernel.org/r/20241226062207.3352629-1-jeffbai@aosc.io
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
In CoCo VMs it is possible for the untrusted host to cause
set_memory_decrypted() to fail such that an error is returned
and the resulting memory is shared. Callers need to take care
to handle these errors to avoid returning decrypted (shared)
memory to the page allocator, which could lead to functional
or security issues.
Leak the decrypted memory when set_memory_decrypted() fails,
and don't need to print an error since set_memory_decrypted()
will call WARN_ONCE().
Fixes: f4738f56d1dc ("virt: tdx-guest: Add Quote generation support using TSM_REPORTS")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240619111801.25630-1-lirongqing%40baidu.com
|
|
Pull block fix from Jens Axboe:
"Just a single fix for ublk setup error handling"
* tag 'block-6.13-20241228' of git://git.kernel.dk/linux:
ublk: detach gendisk from ublk device if add_disk() fails
|
|
Now when destroying the IO queue we call nvme_tcp_stop_io_queues()
twice, nvme_tcp_destroy_io_queues() has an unnecessary call. Here we
try to remove nvme_tcp_destroy_io_queues() and merge it into
nvme_tcp_teardown_io_queues(), simplify the code and align with
nvme-rdma, make it easy to maintaince.
Signed-off-by: Chunguang.xu <chunguang.xu@shopee.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Using mutex lock in IO hot path causes the kernel BUG sleeping while
atomic. Shinichiro[1], first encountered this issue while running blktest
nvme/052 shown below:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585
in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 996, name: (udev-worker)
preempt_count: 0, expected: 0
RCU nest depth: 1, expected: 0
2 locks held by (udev-worker)/996:
#0: ffff8881004570c8 (mapping.invalidate_lock){.+.+}-{3:3}, at: page_cache_ra_unbounded+0x155/0x5c0
#1: ffffffff8607eaa0 (rcu_read_lock){....}-{1:2}, at: blk_mq_flush_plug_list+0xa75/0x1950
CPU: 2 UID: 0 PID: 996 Comm: (udev-worker) Not tainted 6.12.0-rc3+ #339
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x6a/0x90
__might_resched.cold+0x1f7/0x23d
? __pfx___might_resched+0x10/0x10
? vsnprintf+0xdeb/0x18f0
__mutex_lock+0xf4/0x1220
? nvmet_subsys_nsid_exists+0xb9/0x150 [nvmet]
? __pfx_vsnprintf+0x10/0x10
? __pfx___mutex_lock+0x10/0x10
? snprintf+0xa5/0xe0
? xas_load+0x1ce/0x3f0
? nvmet_subsys_nsid_exists+0xb9/0x150 [nvmet]
nvmet_subsys_nsid_exists+0xb9/0x150 [nvmet]
? __pfx_nvmet_subsys_nsid_exists+0x10/0x10 [nvmet]
nvmet_req_find_ns+0x24e/0x300 [nvmet]
nvmet_req_init+0x694/0xd40 [nvmet]
? blk_mq_start_request+0x11c/0x750
? nvme_setup_cmd+0x369/0x990 [nvme_core]
nvme_loop_queue_rq+0x2a7/0x7a0 [nvme_loop]
? __pfx___lock_acquire+0x10/0x10
? __pfx_nvme_loop_queue_rq+0x10/0x10 [nvme_loop]
__blk_mq_issue_directly+0xe2/0x1d0
? __pfx___blk_mq_issue_directly+0x10/0x10
? blk_mq_request_issue_directly+0xc2/0x140
blk_mq_plug_issue_direct+0x13f/0x630
? lock_acquire+0x2d/0xc0
? blk_mq_flush_plug_list+0xa75/0x1950
blk_mq_flush_plug_list+0xa9d/0x1950
? __pfx_blk_mq_flush_plug_list+0x10/0x10
? __pfx_mpage_readahead+0x10/0x10
__blk_flush_plug+0x278/0x4d0
? __pfx___blk_flush_plug+0x10/0x10
? lock_release+0x460/0x7a0
blk_finish_plug+0x4e/0x90
read_pages+0x51b/0xbc0
? __pfx_read_pages+0x10/0x10
? lock_release+0x460/0x7a0
page_cache_ra_unbounded+0x326/0x5c0
force_page_cache_ra+0x1ea/0x2f0
filemap_get_pages+0x59e/0x17b0
? __pfx_filemap_get_pages+0x10/0x10
? lock_is_held_type+0xd5/0x130
? __pfx___might_resched+0x10/0x10
? find_held_lock+0x2d/0x110
filemap_read+0x317/0xb70
? up_write+0x1ba/0x510
? __pfx_filemap_read+0x10/0x10
? inode_security+0x54/0xf0
? selinux_file_permission+0x36d/0x420
blkdev_read_iter+0x143/0x3b0
vfs_read+0x6ac/0xa20
? __pfx_vfs_read+0x10/0x10
? __pfx_vm_mmap_pgoff+0x10/0x10
? __pfx___seccomp_filter+0x10/0x10
ksys_read+0xf7/0x1d0
? __pfx_ksys_read+0x10/0x10
do_syscall_64+0x93/0x180
? lockdep_hardirqs_on_prepare+0x16d/0x400
? do_syscall_64+0x9f/0x180
? lockdep_hardirqs_on+0x78/0x100
? do_syscall_64+0x9f/0x180
? lockdep_hardirqs_on_prepare+0x16d/0x400
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f565bd1ce11
Code: 00 48 8b 15 09 90 0d 00 f7 d8 64 89 02 b8 ff ff ff ff eb bd e8 d0 ad 01 00 f3 0f 1e fa 80 3d 35 12 0e 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 4f c3 66 0f 1f 44 00 00 55 48 89 e5 48 83 ec
RSP: 002b:00007ffd6e7a20c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000001000 RCX: 00007f565bd1ce11
RDX: 0000000000001000 RSI: 00007f565babb000 RDI: 0000000000000014
RBP: 00007ffd6e7a2130 R08: 00000000ffffffff R09: 0000000000000000
R10: 0000556000bfa610 R11: 0000000000000246 R12: 000000003ffff000
R13: 0000556000bfa5b0 R14: 0000000000000e00 R15: 0000556000c07328
</TASK>
Apparently, the above issue is caused due to using mutex lock while
we're in IO hot path. It's a regression caused with commit 505363957fad
("nvmet: fix nvme status code when namespace is disabled"). The mutex
->su_mutex is used to find whether a disabled nsid exists in the config
group or not. This is to differentiate between a nsid that is disabled
vs non-existent.
To mitigate the above issue, we've worked upon a fix[2] where we now
insert nsid in subsys Xarray as soon as it's created under config group
and later when that nsid is enabled, we add an Xarray mark on it and set
ns->enabled to true. The Xarray mark is useful while we need to loop
through all enabled namepsaces under a subsystem using xa_for_each_marked()
API. If later a nsid is disabled then we clear Xarray mark from it and also
set ns->enabled to false. It's only when nsid is deleted from the config
group we delete it from the Xarray.
So with this change, now we could easily differentiate a nsid is disabled
(i.e. Xarray entry for ns exists but ns->enabled is set to false) vs non-
existent (i.e.Xarray entry for ns doesn't exist).
Link: https://lore.kernel.org/linux-nvme/20241022070252.GA11389@lst.de/ [2]
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Closes: https://lore.kernel.org/linux-nvme/tqcy3sveity7p56v7ywp7ssyviwcb3w4623cnxj3knoobfcanq@yxgt2mjkbkam/ [1]
Fixes: 505363957fad ("nvmet: fix nvme status code when namespace is disabled")
Fix-suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|