Age | Commit message (Collapse) | Author |
|
When mb-xdp is set and return is XDP_PASS, packet is converted from
xdp_buff to sk_buff with xdp_update_skb_shared_info() in
bnxt_xdp_build_skb().
bnxt_xdp_build_skb() passes incorrect truesize argument to
xdp_update_skb_shared_info().
The truesize is calculated as BNXT_RX_PAGE_SIZE * sinfo->nr_frags but
the skb_shared_info was wiped by napi_build_skb() before.
So it stores sinfo->nr_frags before bnxt_xdp_build_skb() and use it
instead of getting skb_shared_info from xdp_get_shared_info_from_buff().
Splat looks like:
------------[ cut here ]------------
WARNING: CPU: 2 PID: 0 at net/core/skbuff.c:6072 skb_try_coalesce+0x504/0x590
Modules linked in: xt_nat xt_tcpudp veth af_packet xt_conntrack nft_chain_nat xt_MASQUERADE nf_conntrack_netlink xfrm_user xt_addrtype nft_coms
CPU: 2 UID: 0 PID: 0 Comm: swapper/2 Not tainted 6.14.0-rc2+ #3
RIP: 0010:skb_try_coalesce+0x504/0x590
Code: 4b fd ff ff 49 8b 34 24 40 80 e6 40 0f 84 3d fd ff ff 49 8b 74 24 48 40 f6 c6 01 0f 84 2e fd ff ff 48 8d 4e ff e9 25 fd ff ff <0f> 0b e99
RSP: 0018:ffffb62c4120caa8 EFLAGS: 00010287
RAX: 0000000000000003 RBX: ffffb62c4120cb14 RCX: 0000000000000ec0
RDX: 0000000000001000 RSI: ffffa06e5d7dc000 RDI: 0000000000000003
RBP: ffffa06e5d7ddec0 R08: ffffa06e6120a800 R09: ffffa06e7a119900
R10: 0000000000002310 R11: ffffa06e5d7dcec0 R12: ffffe4360575f740
R13: ffffe43600000000 R14: 0000000000000002 R15: 0000000000000002
FS: 0000000000000000(0000) GS:ffffa0755f700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f147b76b0f8 CR3: 00000001615d4000 CR4: 00000000007506f0
PKRU: 55555554
Call Trace:
<IRQ>
? __warn+0x84/0x130
? skb_try_coalesce+0x504/0x590
? report_bug+0x18a/0x1a0
? handle_bug+0x53/0x90
? exc_invalid_op+0x14/0x70
? asm_exc_invalid_op+0x16/0x20
? skb_try_coalesce+0x504/0x590
inet_frag_reasm_finish+0x11f/0x2e0
ip_defrag+0x37a/0x900
ip_local_deliver+0x51/0x120
ip_sublist_rcv_finish+0x64/0x70
ip_sublist_rcv+0x179/0x210
ip_list_rcv+0xf9/0x130
How to reproduce:
<Node A>
ip link set $interface1 xdp obj xdp_pass.o
ip link set $interface1 mtu 9000 up
ip a a 10.0.0.1/24 dev $interface1
<Node B>
ip link set $interfac2 mtu 9000 up
ip a a 10.0.0.2/24 dev $interface2
ping 10.0.0.1 -s 65000
Following ping.py patch adds xdp-mb-pass case. so ping.py is going to be
able to reproduce this issue.
Fixes: 1dc4c557bfed ("bnxt: adding bnxt_xdp_build_skb to build skb from multibuffer xdp_buff")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://patch.msgid.link/20250309134219.91670-2-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
usb_control_msg() returns the number of transferred bytes or a negative
error code. The current implementation propagates the transferred byte
count, which is unintended. This affects code paths that assume a
boolean success/failure check, such as the EEPROM detection logic.
Fix this by ensuring lan78xx_read_reg() and lan78xx_write_reg() return
only 0 on success and preserve negative error codes.
This approach is consistent with existing usage, as the transferred byte
count is not explicitly checked elsewhere.
Fixes: 8b1b2ca83b20 ("net: usb: lan78xx: Improve error handling in EEPROM and OTP operations")
Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/all/ac965de8-f320-430f-80f6-b16f4e1ba06d@sirena.org.uk
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Tested-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20250307101223.3025632-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In mlx5_chains_create_table(), the return value of mlx5_get_fdb_sub_ns()
and mlx5_get_flow_namespace() must be checked to prevent NULL pointer
dereferences. If either function fails, the function should log error
message with mlx5_core_warn() and return error pointer.
Fixes: 39ac237ce009 ("net/mlx5: E-Switch, Refactor chains and priorities")
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250307021820.2646-1-vulab@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Need to use the correct IP block type. VCE vs VCN.
Fixes mclk issues on Hawaii.
Suggested by selendym.
Fixes: 82ae6619a450 ("drm/amdgpu: update the handle ptr in wait_for_idle")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3997
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Cc: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 02438acd252395628d74cfac692efbb676d21521)
Cc: stable@vger.kernel.org
|
|
[Why]
A slab-use-after-free is reported when HDCP is destroyed but the
property_validate_dwork queue is still running.
[How]
Cancel the delayed work when destroying workqueue.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4006
Fixes: da3fd7ac0bcf ("drm/amd/display: Update CP property based on HW query")
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 725a04ba5a95e89c89633d4322430cfbca7ce128)
Cc: stable@vger.kernel.org
|
|
[WHY & HOW]
A warning message "WARNING: CPU: 4 PID: 459 at ... /dc_resource.c:3397
calculate_phy_pix_clks+0xef/0x100 [amdgpu]" occurs because the
display_color_depth == COLOR_DEPTH_141414 is not handled. This is
observed in Radeon RX 6600 XT.
It is fixed by assigning pix_clk * (14 * 3) / 24 - same as the rests.
Also fixes the indentation in get_norm_pix_clk.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 274a87eb389f58eddcbc5659ab0b180b37e92775)
Cc: stable@vger.kernel.org
|
|
[Why]
GPU reset will attempt to restore cached state, but brightness doesn't
get restored. It will come back at 100% brightness, but userspace thinks
it's the previous value.
[How]
When running resume sequence if GPU is in reset restore brightness
to previous value.
Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5e19e2b57b6bb640d68dfc7991e1e182922cf867)
Cc: stable@vger.kernel.org
|
|
[Why]
To avoid flickering during boot default brightness level set by BIOS
should be maintained for as much of the boot as feasible.
commit 2fe87f54abdc ("drm/amd/display: Set default brightness according
to ACPI") attempted to set the right levels for AC vs DC, but brightness
still got reset to maximum level in initialization code for
setup_backlight_device().
[How]
Remove the hardcoded initialization in setup_backlight_device() and
instead program brightness value to match BIOS (AC or DC). This avoids a
brightness flicker from kernel changing the value. Userspace may however
still change it during boot.
Fixes: 2fe87f54abdc ("drm/amd/display: Set default brightness according to ACPI")
Acked-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0747acf3311229e22009bec4a9e7fc30c879e842)
Cc: stable@vger.kernel.org
|
|
[Why]
It seems HPD interrupts are enabled by default for all connectors, even
if the hpd source isn't valid. An eDP for example, does not have a valid
hpd source (but does have a valid hpdrx source; see construct_phy()).
Thus, eDPs should have their hpd interrupt disabled.
In the past, this wasn't really an issue. Although the driver gets
interrupted, then acks by writing to hw registers, there weren't any
subscribed handlers that did anything meaningful (see
register_hpd_handlers()).
But things changed with the introduction of IPS. s2idle requires that
the driver allows IPS for DMUB fw to put hw to sleep. Since register
access requires hw to be awake, the driver will block IPS entry to do
so. And no IPS means no hw sleep during s2idle.
This was the observation on DCN35 systems with an eDP. During suspend,
the eDP toggled its hpd pin as part of the panel power down sequence.
The driver was then interrupted, and acked by writing to registers,
blocking IPS entry.
[How]
Since DC marks eDP connections as having invalid hpd sources (see
construct_phy()), DM should disable them at the hw level. Do so in
amdgpu_dm_hpd_init() by disabling all hpd ints first, then selectively
enabling ones for connectors that have valid hpd sources.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7b1ba19eb15f88e70782642ce2d934211269337b)
Cc: stable@vger.kernel.org
|
|
When userspace invokes S4 the flow is:
1) amdgpu_pmops_prepare()
2) amdgpu_pmops_freeze()
3) Create hibernation image
4) amdgpu_pmops_thaw()
5) Write out image to disk
6) Turn off system
Then on resume amdgpu_pmops_restore() is called.
This flow has a problem that because amdgpu_pmops_thaw() is called
it will call amdgpu_device_resume() which will resume all of the GPU.
This includes turning the display hardware back on and discovering
connectors again.
This is an unexpected experience for the display to turn back on.
Adjust the flow so that during the S4 sequence display hardware is
not turned back on.
Reported-by: Xaver Hugl <xaver.hugl@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2038
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Link: https://lore.kernel.org/r/20250306185124.44780-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 68bfdc8dc0a1a7fdd9ab61e69907ae71a6fd3d91)
|
|
Starting from 6.11, AMDGPU driver, while being loaded with amdgpu.dc=1,
due to lack of .is_two_pixels_per_container function in dce60_tg_funcs,
causes a NULL pointer dereference on PCs with old GPUs, such as R9 280X.
So this fix adds missing .is_two_pixels_per_container to dce60_tg_funcs.
Reported-by: Rosen Penev <rosenp@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3942
Fixes: e6a901a00822 ("drm/amd/display: use even ODM slice width for two pixels per container")
Signed-off-by: Aliaksei Urbanski <aliaksei.urbanski@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit bd4b125eb949785c6f8a53b0494e32795421209d)
Cc: stable@vger.kernel.org
|
|
We advertise DCC as supported for NV12/P010 formats on GFX12,
but it would fail on this check on atomic commit.
Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ba795235a2b99ba9bbef647ab003b2f3145d9bbb)
Cc: stable@vger.kernel.org # 6.12.x
|
|
The check for args->extensions is repeated twice in xe_vm_create_ioctl().
This commit removes the redundant check to streamline the code.
Fixes: 7224788f6756 ("drm/xe: Kill XE_VM_PROPERTY_BIND_OP_ERROR_CAPTURE_ADDRESS extension")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Xin Wang <x.wang@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250303004942.951699-1-x.wang@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 8da8aecf1f2d89c2b8188bcf7aa252ec146ddd12)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Raptor Lake-S SoCs notify correctable memory errors via CMCI (Corrected
Machine Check Interrupt). Switch Raptor Lake-S EDAC support from polling
to interrupt mode by registering the callback to the MCE decode notifier
chain.
Note that as Raptor Lake-S SoCs may not recover from uncorrectable memory
errors, the system will hang as soon as this type of error occurs, and the
registered callback on the MCE decode chain will not be executed. This is
the expected behavior.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Gary Wang <gary.c.wang@intel.com>
Link: https://lore.kernel.org/r/20250310011411.31685-12-qiuxu.zhuo@intel.com
|
|
The Intel Raptor Lake-S SoC contains two memory controllers with DDR5
memory type and out-of-band ECC capability. The resource definitions of
the memory controller are different from previous generations. One notable
difference is that the PCI ERRSTS register is deprecated and is not used
to indicate the presence of errors or to clear the MMIO-mapped ECC error
log regsiters.
Extend the ie31200_edac driver to support multiple memory controllers,
add a resource configuration table and use an MSR register to clear the
ECC error log registers to provide EDAC support for Raptor Lake-S SoCs.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Gary Wang <gary.c.wang@intel.com>
Link: https://lore.kernel.org/r/20250310011411.31685-11-qiuxu.zhuo@intel.com
|
|
Split ie31200_probe1() into two helper functions to easily extend support
for multiple memory controllers.
No functional changes intended.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Gary Wang <gary.c.wang@intel.com>
Link: https://lore.kernel.org/r/20250310011411.31685-10-qiuxu.zhuo@intel.com
|
|
Fold the two channel loops to simplify the code and improve readability.
Also, delete the comments related to the DRB register, as this register
is not used here.
No functional changes intended.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Gary Wang <gary.c.wang@intel.com>
Link: https://lore.kernel.org/r/20250310011411.31685-9-qiuxu.zhuo@intel.com
|
|
The current dimm_data structure contains encoded DIMM information,
which needs to be decoded for a given SoC when it is used. Make it
contain decoded information when it's initialized so that the places
where it is used do not need to decode it again, thereby simplifying
the code.
No functional changes intended.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Gary Wang <gary.c.wang@intel.com>
Link: https://lore.kernel.org/r/20250310011411.31685-8-qiuxu.zhuo@intel.com
|
|
The resources such as MMIO, register offset, register mask, memory DIMM
information, ECC error log location, etc., of the memory controller, and
the number of memory controllers can be device-ID-specific. It requires
adding numerous 'if (device_id == new_id)' special handling cases to the
code to support a new SoC.
Make these kinds of resources configurable and separate them from the code
to facilitate the addition of new SoC support.
No functional changes intended.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Gary Wang <gary.c.wang@intel.com>
Link: https://lore.kernel.org/r/20250310011411.31685-7-qiuxu.zhuo@intel.com
|
|
Use PCI_VDEVICE() to simplify the pci_device_id table.
No functional changes intended.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Gary Wang <gary.c.wang@intel.com>
Link: https://lore.kernel.org/r/20250310011411.31685-6-qiuxu.zhuo@intel.com
|
|
The 3rd parameter of *populate_dimm_info() pertains to the DIMM index
within a channel, not the channel index. Fix the parameter name to dimm
to reflect its actual purpose.
No functional changes intended.
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Gary Wang <gary.c.wang@intel.com>
Link: https://lore.kernel.org/r/20250310011411.31685-5-qiuxu.zhuo@intel.com
|
|
The error path order of ie31200_init() is incorrect, fix it.
Fixes: 709ed1bcef12 ("EDAC/ie31200: Fallback if host bridge device is already initialized")
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Gary Wang <gary.c.wang@intel.com>
Link: https://lore.kernel.org/r/20250310011411.31685-4-qiuxu.zhuo@intel.com
|
|
The DIMM size mask for {Sky, Kaby, Coffee} Lake is not bits{7:0},
but bits{5:0}. Fix it.
Fixes: 953dee9bbd24 ("EDAC, ie31200_edac: Add Skylake support")
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Gary Wang <gary.c.wang@intel.com>
Link: https://lore.kernel.org/r/20250310011411.31685-3-qiuxu.zhuo@intel.com
|
|
The EDAC_MC_LAYER_CHIP_SELECT layer pertains to the rank, not the DIMM.
Fix its size to reflect the number of ranks instead of the number of DIMMs.
Also delete the unused macros IE31200_{DIMMS,RANKS}.
Fixes: 7ee40b897d18 ("ie31200_edac: Introduce the driver")
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Tested-by: Gary Wang <gary.c.wang@intel.com>
Link: https://lore.kernel.org/r/20250310011411.31685-2-qiuxu.zhuo@intel.com
|
|
The function nvmet_pci_epf_create_cq() always unconditionally calls
nvmet_pci_epf_add_irq_vector() to add an IRQ vector for a completion
queue. But this is not correct if the host requested the creation of a
completion queue for polling, without an IRQ vector specified (i.e. the
flag NVME_CQ_IRQ_ENABLED is not set).
Fix this by calling nvmet_pci_epf_add_irq_vector() and setting the queue
flag NVMET_PCI_EPF_Q_IRQ_ENABLED for the cq only if NVME_CQ_IRQ_ENABLED
is set. While at it, also fix the error path to add the missing removal
of the added IRQ vector if nvmet_cq_create() fails.
Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
The function nvmet_pci_epf_create_sq() use test_and_set_bit() to check
that a submission queue is not already live and if not, set the
NVMET_PCI_EPF_Q_LIVE queue flag to declare the sq live (ready to use).
However, this is done on entry to the function, before the submission
queue is actually fully initialized and ready to use. This creates a
race situation with the function nvmet_pci_epf_poll_sqs_work() which
looks at the NVMET_PCI_EPF_Q_LIVE queue flag to poll the submission
queue when it is live. This race can lead to invalid DMA transfers if
nvmet_pci_epf_poll_sqs_work() runs after the NVMET_PCI_EPF_Q_LIVE flag
is set but before setting the sq pci address and doorbell ofset.
Avoid this race by only testing the NVMET_PCI_EPF_Q_LIVE flag on entry
to nvmet_pci_epf_create_sq() and setting it after the submission queue
is fully setup before nvmet_pci_epf_create_sq() returns success.
Since the function nvmet_pci_epf_create_cq() also has the same racy flag
setting pattern, also make a similar change in that function.
Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
The VMBus driver manages the MMIO space it owns via the hyperv_mmio
resource tree. Because the synthetic video framebuffer portion of the
MMIO space is initially setup by the Hyper-V host for each guest, the
VMBus driver does an early reserve of that portion of MMIO space in the
hyperv_mmio resource tree. It saves a pointer to that resource in
fb_mmio. When a VMBus driver requests MMIO space and passes "true"
for the "fb_overlap_ok" argument, the reserved framebuffer space is
used if possible. In that case it's not necessary to do another request
against the "shadow" hyperv_mmio resource tree because that resource
was already requested in the early reserve steps.
However, the vmbus_free_mmio() function currently does no special
handling for the fb_mmio resource. When a framebuffer device is
removed, or the driver is unbound, the current code for
vmbus_free_mmio() releases the reserved resource, leaving fb_mmio
pointing to memory that has been freed. If the same or another
driver is subsequently bound to the device, vmbus_allocate_mmio()
checks against fb_mmio, and potentially gets garbage. Furthermore
a second unbind operation produces this "nonexistent resource" error
because of the unbalanced behavior between vmbus_allocate_mmio() and
vmbus_free_mmio():
[ 55.499643] resource: Trying to free nonexistent
resource <0x00000000f0000000-0x00000000f07fffff>
Fix this by adding logic to vmbus_free_mmio() to recognize when
MMIO space in the fb_mmio reserved area would be released, and don't
release it. This filtering ensures the fb_mmio resource always exists,
and makes vmbus_free_mmio() more parallel with vmbus_allocate_mmio().
Fixes: be000f93e5d7 ("drivers:hv: Track allocations of children of hv_vmbus in private resource tree")
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Link: https://lore.kernel.org/r/20250310035208.275764-1-mhklinux@outlook.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250310035208.275764-1-mhklinux@outlook.com>
|
|
Video players (eg. mpv) do periodic XResetScreenSaver() calls to
keep the screen on while the video playing. The modesetting ddx
plumbs these straight through into the kernel as DPMS setproperty
ioctls, without any filtering whatsoever. When implemented via
atomic these end up as empty commits on the crtc (which will
nonetheless take one full frame), which leads to a dropped
frame every time XResetScreenSaver() is called.
Let's just filter out redundant DPMS property changes in the
kernel to avoid this issue.
v2: Explain the resulting commits a bit better (Sima)
Document the behaviour in uapi docs (Sima)
Cc: stable@vger.kernel.org
Testcase: igt/kms_flip/flip-vs-dpms-on-nop
Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250219160239.17502-1-ville.syrjala@linux.intel.com
|
|
The PCIe error handling has the nvme driver quiesce the device, attempt
to restart it, then wait for that restart to complete.
A PCIe DPC event also toggles the PCIe link. If the slot doesn't have
out-of-band presence detection, this will trigger a pciehp
re-enumeration.
The error handling that calls nvme_error_resume is holding the device
lock while this happens. This lock blocks pciehp's request to disconnect
the driver from proceeding.
Meanwhile the nvme's reset can't make forward progress because its
device isn't there anymore with outstanding IO, and the timeout handler
won't do anything to fix it because the device is undergoing error
handling.
End result: deadlocked.
Fix this by having the timeout handler short cut the disabling for a
disconnected PCIe device. The downside is that we're relying on an IO
timeout to clean up this mess, which could be a minute by default.
Tested-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Nilay Shroff <nilay@linux.ibm.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
The hd44780_common_alloc() uses hd for local variable while
the respective header uses hdc, rename to make it consistent
and avoid potential confuse with the drivers that use both
for different reasons. No functional changes intended.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
|
|
HD44780 APIs all operate on struct charlcd objects. Moreover, the current users
always call charlcd_alloc() and hd44780_common_alloc(). Make the latter call
the former, so eliminate the additional allocation, to make it consistent with
the rest of API and avoid duplication.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
|
|
Use the symmetrical API to free the common resources.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
|
|
Use the symmetrical API to free the common resources.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
|
|
Introduce hd44780_common_free() for symmetrical operation
to hd44780_common_alloc(). It will allow to modify both
in the future without touching the users.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
|
|
Allocate memory for custom data in charlcd_alloc() instead of doing that
explicitly in the driver.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
|
|
hd44780_common"
Commit 2545c1c948a6 ("auxdisplay: Move hwidth and bwidth to struct
hd44780_common") makes charlcd_alloc() argument-less effectively dropping
the single allocation for the struct charlcd_priv object along with
the driver specific one. Restore that behaviour here.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
|
|
In a rare situation of thermal limit during resume, GuC can
be slow and run into delays like this:
xe 0000:00:02.0: [drm] GT1: excessive init time: 667ms! \
[status = 0x8002F034, timeouts = 0]
xe 0000:00:02.0: [drm] GT1: excessive init time: \
[freq = 100MHz (req = 800MHz), before = 100MHz, \
perf_limit_reasons = 0x1C001000]
xe 0000:00:02.0: [drm] *ERROR* GT1: GuC PC Start failed
------------[ cut here ]------------
xe 0000:00:02.0: [drm] GT1: Failed to start GuC PC: -EIO
When this happens, it will block entirely the GPU to be used.
So, let's try and with a huge timeout in the hope it comes back.
Also, let's collect some information on how long it is usually
taking on situations like this, so perhaps the time can be tuned
later.
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250307160307.1093391-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit b4b05e53b550a886b4754b87fd0dd2b304579e85)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Currently, many instability cases related to D3Cold -> D0 transition
on BMG are under investigation. Among them some bad cases where
the device is lost after 1 to 3 transitions from D3Cold to D0
on the runtime pm, with pcieport upstream bridge port link retrain
failure.
In other cases, it works fine, but with some sudden random memory
corruptions after D3cold, that could be 0xffff missed ack on GT
forcewake or GuC reload related failures.
In some other cases though, D3Cold -> D0 works pretty reliably.
It looks like it is a combination of GPU cards and Host boards at
this point. So, there is no possible/available quirk at this time.
This patch disables the D3Cold by default on BMG by reducing the
vram_d3cold_threshold to 0. Users and developers who wants to enable
it are still able to via
$ echo 300 > /sys/bus/pci/devices/<addr>/vram_d3cold_threshold
Fixes: 3adcf970dc7e ("drm/xe/bmg: Drop force_probe requirement")
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4037
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4395
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4396
Cc: Karthik Poosa <karthik.poosa@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250308005636.1475420-1-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit d945cc876277851053c0cf37927c8d7bd9d0e880)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
We currently call intel_set_cdclk_post_plane_update() far
too early. When pipes are active during the reprogramming
the current spot only works for the cd2x divider update
case, as that is synchronize to the pipe's vblank. Squashing
and crawling are not synchronized in any way, so doing the
programming while the pipes/planes are potentially still using
the old hardware state could lead to underruns.
Move the post plane reprgramming to a spot where we know
that the pipes/planes have switched over the new hardware
state.
Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250218211913.27867-2-ville.syrjala@linux.intel.com
Reviewed-by: Vinod Govindapillai <vinod.govindapillai@intel.com>
(cherry picked from commit fb64f5568c0e0b5730733d70a012ae26b1a55815)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
The assert incorrectly checks the total length processed which
can in fact be greater than the number of pages. Fix.
Fixes: 0a98219bcc96 ("drm/xe/hmm: Don't dereference struct page pointers without notifier lock")
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250307100109.21397-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 70e5043ba85eae199b232e39921abd706b5c1fa4)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
A GT resets can be occurring in parallel while cancelling
work in async call which can requeue these workers.
to avoid that, lets first release guc ids and then cancel
work so they don't requeued.
Fixes: 8ae8a2e8dd21 ("drm/xe: Long running job update")
Fixes: 12c2f962fe71 ("drm/xe: cancel pending job timer before freeing scheduler")
Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250306131211.975503-1-tejas.upadhyay@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 8e8d76f62329127b31c64a034b052fb9e30e92af)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-linus
Mika writes:
thunderbolt: Fix for v6.14-rc7
This includes single USB4/Thunderbolt fix for v6.14-rc7:
- Fix use-after-free in resume from hibernate.
This has been in linux-next with no reported issues.
* tag 'thunderbolt-for-v6.14-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt:
thunderbolt: Prevent use-after-free in resume from hibernate
|
|
struct gpio_chip now has callbacks for setting line values that return
an integer, allowing to indicate failures. Convert the driver to using
them.
Link: https://lore.kernel.org/r/20250306-gpiochip-set-conversion-v2-2-a76e72e21425@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
Reduce the code complexity by using automatic lock guards with the I2C
mutex.
Link: https://lore.kernel.org/r/20250306-gpiochip-set-conversion-v2-1-a76e72e21425@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
struct gpio_chip now has callbacks for setting line values that return
an integer, allowing to indicate failures. Convert the driver to using
them.
Link: https://lore.kernel.org/r/20250303-gpiochip-set-conversion-v1-15-1d5cceeebf8b@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
Reduce the code complexity by using automatic lock guards with the raw
spinlock.
Link: https://lore.kernel.org/r/20250303-gpiochip-set-conversion-v1-14-1d5cceeebf8b@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
struct gpio_chip now has callbacks for setting line values that return
an integer, allowing to indicate failures. Convert the driver to using
them.
Link: https://lore.kernel.org/r/20250303-gpiochip-set-conversion-v1-13-1d5cceeebf8b@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
Reduce the code complexity by using automatic lock guards with the raw
spinlock.
Link: https://lore.kernel.org/r/20250303-gpiochip-set-conversion-v1-12-1d5cceeebf8b@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
struct gpio_chip now has callbacks for setting line values that return
an integer, allowing to indicate failures. Convert the driver to using
them.
Reviewed-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20250303-gpiochip-set-conversion-v1-11-1d5cceeebf8b@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
struct gpio_chip now has callbacks for setting line values that return
an integer, allowing to indicate failures. Convert the driver to using
them.
Link: https://lore.kernel.org/r/20250303-gpiochip-set-conversion-v1-10-1d5cceeebf8b@linaro.org
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|