Age | Commit message (Collapse) | Author |
|
Currently we treat EFAULT from hmm_range_fault() as a non-fatal error
when called from xe_vm_userptr_pin() with the idea that we want to avoid
killing the entire vm and chucking an error, under the assumption that
the user just did an unmap or something, and has no intention of
actually touching that memory from the GPU. At this point we have
already zapped the PTEs so any access should generate a page fault, and
if the pin fails there also it will then become fatal.
However it looks like it's possible for the userptr vma to still be on
the rebind list in preempt_rebind_work_func(), if we had to retry the
pin again due to something happening in the caller before we did the
rebind step, but in the meantime needing to re-validate the userptr and
this time hitting the EFAULT.
This explains an internal user report of hitting:
[ 191.738349] WARNING: CPU: 1 PID: 157 at drivers/gpu/drm/xe/xe_res_cursor.h:158 xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[ 191.738551] Workqueue: xe-ordered-wq preempt_rebind_work_func [xe]
[ 191.738616] RIP: 0010:xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[ 191.738690] Call Trace:
[ 191.738692] <TASK>
[ 191.738694] ? show_regs+0x69/0x80
[ 191.738698] ? __warn+0x93/0x1a0
[ 191.738703] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[ 191.738759] ? report_bug+0x18f/0x1a0
[ 191.738764] ? handle_bug+0x63/0xa0
[ 191.738767] ? exc_invalid_op+0x19/0x70
[ 191.738770] ? asm_exc_invalid_op+0x1b/0x20
[ 191.738777] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[ 191.738834] ? ret_from_fork_asm+0x1a/0x30
[ 191.738849] bind_op_prepare+0x105/0x7b0 [xe]
[ 191.738906] ? dma_resv_reserve_fences+0x301/0x380
[ 191.738912] xe_pt_update_ops_prepare+0x28c/0x4b0 [xe]
[ 191.738966] ? kmemleak_alloc+0x4b/0x80
[ 191.738973] ops_execute+0x188/0x9d0 [xe]
[ 191.739036] xe_vm_rebind+0x4ce/0x5a0 [xe]
[ 191.739098] ? trace_hardirqs_on+0x4d/0x60
[ 191.739112] preempt_rebind_work_func+0x76f/0xd00 [xe]
Followed by NPD, when running some workload, since the sg was never
actually populated but the vma is still marked for rebind when it should
be skipped for this special EFAULT case. This is confirmed to fix the
user report.
v2 (MattB):
- Move earlier.
v3 (MattB):
- Update the commit message to make it clear that this indeed fixes the
issue.
Fixes: 521db22a1d70 ("drm/xe: Invalidate userptr VMA on page pin fault")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: <stable@vger.kernel.org> # v6.10+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250221143840.167150-5-matthew.auld@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 6b93cb98910c826c2e2004942f8b060311e43618)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
On error restore anything still on the pin_list back to the invalidation
list on error. For the actual pin, so long as the vma is tracked on
either list it should get picked up on the next pin, however it looks
possible for the vma to get nuked but still be present on this per vm
pin_list leading to corruption. An alternative might be then to instead
just remove the link when destroying the vma.
v2:
- Also add some asserts.
- Keep the overzealous locking so that we are consistent with the docs;
updating the docs and related bits will be done as a follow up.
Fixes: ed2bdf3b264d ("drm/xe/vm: Subclass userptr vmas")
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250221143840.167150-4-matthew.auld@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 4e37e928928b730de9aa9a2f5dc853feeebc1742)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Signed-off-by: Ken Raeburn <raeburn@redhat.com>
Signed-off-by: Matthew Sakai <msakai@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
|
|
When the kernel is compiled without LED framework support the
rtl8366rb fails to build like this:
rtl8366rb.o: in function `rtl8366rb_setup_led':
rtl8366rb.c:953:(.text.unlikely.rtl8366rb_setup_led+0xe8):
undefined reference to `led_init_default_state_get'
rtl8366rb.c:980:(.text.unlikely.rtl8366rb_setup_led+0x240):
undefined reference to `devm_led_classdev_register_ext'
As this is constantly coming up in different randconfig builds,
bite the bullet and create a separate file for the offending
code, split out a header with all stuff needed both in the
core driver and the leds code.
Add a new bool Kconfig option for the LED compile target, such
that it depends on LEDS_CLASS=y || LEDS_CLASS=RTL8366RB
which make LED support always available when LEDS_CLASS is
compiled into the kernel and enforce that if the LEDS_CLASS
is a module, then the RTL8366RB driver needs to be a module
as well so that modprobe can resolve the dependencies.
Fixes: 32d617005475 ("net: dsa: realtek: add LED drivers for rtl8366rb")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202502070525.xMUImayb-lkp@intel.com/
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fix from Wolfram Sang:
"Revert one cleanup which turned out to eat too much stack space"
* tag 'i2c-for-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: core: Allocate temporary client dynamically
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC fix from Borislav Petkov:
- Have qcom_edac use the correct interrupt enable register to configure
the RAS interrupt lines
* tag 'edac_urgent_for_v6.14_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/qcom: Correct interrupt enable register configuration
|
|
While using nvme target with use_srq on, below kernel panic is noticed.
[ 549.698111] bnxt_en 0000:41:00.0 enp65s0np0: FEC autoneg off encoding: Clause 91 RS(544,514)
[ 566.393619] Oops: divide error: 0000 [#1] PREEMPT SMP NOPTI
..
[ 566.393799] <TASK>
[ 566.393807] ? __die_body+0x1a/0x60
[ 566.393823] ? die+0x38/0x60
[ 566.393835] ? do_trap+0xe4/0x110
[ 566.393847] ? bnxt_qplib_alloc_init_hwq+0x1d4/0x580 [bnxt_re]
[ 566.393867] ? bnxt_qplib_alloc_init_hwq+0x1d4/0x580 [bnxt_re]
[ 566.393881] ? do_error_trap+0x7c/0x120
[ 566.393890] ? bnxt_qplib_alloc_init_hwq+0x1d4/0x580 [bnxt_re]
[ 566.393911] ? exc_divide_error+0x34/0x50
[ 566.393923] ? bnxt_qplib_alloc_init_hwq+0x1d4/0x580 [bnxt_re]
[ 566.393939] ? asm_exc_divide_error+0x16/0x20
[ 566.393966] ? bnxt_qplib_alloc_init_hwq+0x1d4/0x580 [bnxt_re]
[ 566.393997] bnxt_qplib_create_srq+0xc9/0x340 [bnxt_re]
[ 566.394040] bnxt_re_create_srq+0x335/0x3b0 [bnxt_re]
[ 566.394057] ? srso_return_thunk+0x5/0x5f
[ 566.394068] ? __init_swait_queue_head+0x4a/0x60
[ 566.394090] ib_create_srq_user+0xa7/0x150 [ib_core]
[ 566.394147] nvmet_rdma_queue_connect+0x7d0/0xbe0 [nvmet_rdma]
[ 566.394174] ? lock_release+0x22c/0x3f0
[ 566.394187] ? srso_return_thunk+0x5/0x5f
Page size and shift info is set only for the user space SRQs.
Set page size and page shift for kernel space SRQs also.
Fixes: 0c4dcd602817 ("RDMA/bnxt_re: Refactor hardware queue memory allocation")
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://patch.msgid.link/1740237621-29291-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
When there is a failure during bind QP, the cleanup flow destroys the
counter regardless if it is the one that created it or not, which is
problematic since if it isn't the one that created it, that counter could
still be in use.
Fix that by destroying the counter only if it was created during this call.
Fixes: 45842fc627c7 ("IB/mlx5: Support statistic q counter configuration")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Link: https://patch.msgid.link/25dfefddb0ebefa668c32e06a94d84e3216257cf.1740033937.git.leon@kernel.org
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Ingo Molnar:
"Fix miscellaneous irqchip bugs"
* tag 'irq-urgent-2025-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/qcom-pdc: Workaround hardware register bug on X1E80100
irqchip/jcore-aic, clocksource/drivers/jcore: Fix jcore-pit interrupt request
irqchip/gic-v3: Fix rk3399 workaround when secure interrupts are enabled
|
|
drivers/i2c/i2c-core-base.c: In function ‘i2c_detect.isra’:
drivers/i2c/i2c-core-base.c:2544:1: warning: the frame size of 1312 bytes is larger than 1024 bytes [-Wframe-larger-than=]
2544 | }
| ^
Fix this by allocating the temporary client structure dynamically, as it
is a rather large structure (1216 bytes, depending on kernel config).
This is basically a revert of the to-be-fixed commit with some
checkpatch improvements.
Fixes: 735668f8e5c9 ("i2c: core: Allocate temp client on the stack in i2c_detect")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Su Hui <suhui@nfschina.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
[wsa: updated commit message, merged tags from similar patch]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
|
|
Stats calculations involve a RMW to add the stat update to the existing
value. This is currently not protected by any synchronization mechanism,
so data races are possible. Add a spinlock to protect the update. The
reader side could be protected using u64_stats, but we would still need
a spinlock for the update side anyway. And we always do an update
immediately before reading the stats anyway.
Fixes: 89e5785fc8a6 ("[PATCH] Atmel MACB ethernet driver")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Link: https://patch.msgid.link/20250220162950.95941-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot found that ipvlan_process_v6_outbound() was assuming
the IPv6 network header isis present in skb->head [1]
Add the needed pskb_network_may_pull() calls for both
IPv4 and IPv6 handlers.
[1]
BUG: KMSAN: uninit-value in __ipv6_addr_type+0xa2/0x490 net/ipv6/addrconf_core.c:47
__ipv6_addr_type+0xa2/0x490 net/ipv6/addrconf_core.c:47
ipv6_addr_type include/net/ipv6.h:555 [inline]
ip6_route_output_flags_noref net/ipv6/route.c:2616 [inline]
ip6_route_output_flags+0x51/0x720 net/ipv6/route.c:2651
ip6_route_output include/net/ip6_route.h:93 [inline]
ipvlan_route_v6_outbound+0x24e/0x520 drivers/net/ipvlan/ipvlan_core.c:476
ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:491 [inline]
ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:541 [inline]
ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:605 [inline]
ipvlan_queue_xmit+0xd72/0x1780 drivers/net/ipvlan/ipvlan_core.c:671
ipvlan_start_xmit+0x5b/0x210 drivers/net/ipvlan/ipvlan_main.c:223
__netdev_start_xmit include/linux/netdevice.h:5150 [inline]
netdev_start_xmit include/linux/netdevice.h:5159 [inline]
xmit_one net/core/dev.c:3735 [inline]
dev_hard_start_xmit+0x247/0xa20 net/core/dev.c:3751
sch_direct_xmit+0x399/0xd40 net/sched/sch_generic.c:343
qdisc_restart net/sched/sch_generic.c:408 [inline]
__qdisc_run+0x14da/0x35d0 net/sched/sch_generic.c:416
qdisc_run+0x141/0x4d0 include/net/pkt_sched.h:127
net_tx_action+0x78b/0x940 net/core/dev.c:5484
handle_softirqs+0x1a0/0x7c0 kernel/softirq.c:561
__do_softirq+0x14/0x1a kernel/softirq.c:595
do_softirq+0x9a/0x100 kernel/softirq.c:462
__local_bh_enable_ip+0x9f/0xb0 kernel/softirq.c:389
local_bh_enable include/linux/bottom_half.h:33 [inline]
rcu_read_unlock_bh include/linux/rcupdate.h:919 [inline]
__dev_queue_xmit+0x2758/0x57d0 net/core/dev.c:4611
dev_queue_xmit include/linux/netdevice.h:3311 [inline]
packet_xmit+0x9c/0x6c0 net/packet/af_packet.c:276
packet_snd net/packet/af_packet.c:3132 [inline]
packet_sendmsg+0x93e0/0xa7e0 net/packet/af_packet.c:3164
sock_sendmsg_nosec net/socket.c:718 [inline]
Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.")
Reported-by: syzbot+93ab4a777bafb9d9f960@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/67b74f01.050a0220.14d86d.02d8.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mahesh Bandewar <maheshb@google.com>
Link: https://patch.msgid.link/20250220155336.61884-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After commit 22600596b675 ("ipv4: give an IPv4 dev to blackhole_netdev")
IPv4 neighbors can be constructed on the blackhole net device, but they
are constructed with an output function (neigh_direct_output()) that
simply calls dev_queue_xmit(). The latter will transmit packets via
'skb->dev' which might not be the blackhole net device if dst_dev_put()
switched 'dst->dev' to the blackhole net device while another CPU was
using the dst entry in ip_output(), but after it already initialized
'skb->dev' from 'dst->dev'.
Specifically, the following can happen:
CPU1 CPU2
udp_sendmsg(sk1) udp_sendmsg(sk2)
udp_send_skb() [...]
ip_output()
skb->dev = skb_dst(skb)->dev
dst_dev_put()
dst->dev = blackhole_netdev
ip_finish_output2()
resolves neigh on dst->dev
neigh_output()
neigh_direct_output()
dev_queue_xmit()
This will result in IPv4 packets being sent without an Ethernet header
via a valid net device:
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp9s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
22:07:02.329668 20:00:40:11:18:fb > 45:00:00:44:f4:94, ethertype Unknown
(0x58c6), length 68:
0x0000: 8dda 74ca f1ae ca6c ca6c 0098 969c 0400 ..t....l.l......
0x0010: 0000 4730 3f18 6800 0000 0000 0000 9971 ..G0?.h........q
0x0020: c4c9 9055 a157 0a70 9ead bf83 38ca ab38 ...U.W.p....8..8
0x0030: 8add ab96 e052 .....R
Fix by making sure that neighbors are constructed on top of the
blackhole net device with an output function that simply consumes the
packets, in a similar fashion to dst_discard_out() and
blackhole_netdev_xmit().
Fixes: 8d7017fd621d ("blackhole_netdev: use blackhole_netdev to invalidate dst entries")
Fixes: 22600596b675 ("ipv4: give an IPv4 dev to blackhole_netdev")
Reported-by: Florian Meister <fmei@sfs.com>
Closes: https://lore.kernel.org/netdev/20250210084931.23a5c2e4@hermes.local/
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250220072559.782296-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- btusb: Always allow SCO packets for user channel
- L2CAP: Fix L2CAP_ECRED_CONN_RSP response
* tag 'for-net-2025-02-21' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: L2CAP: Fix L2CAP_ECRED_CONN_RSP response
Bluetooth: Always allow SCO packets for user channel
====================
Link: https://patch.msgid.link/20250221154941.2139043-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
"Two people stepped up as platform co-maintainers: Andrew Jeffery for
ASpeed and Janne Grunau for Apple.
The rockchip platform gets 9 small fixes for devicetree files,
addressing both compile-time warnings and board specific bugs.
One bugfix for the optee firmware driver addresses a reboot-time hang.
Two drivers need improved Kconfig dependencies to allow wider compile-
testing while hiding the drivers on platforms that can't use them.
ARM SCMI and loongson-guts drivers get minor bugfixes"
* tag 'soc-fixes-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
soc: loongson: loongson2_guts: Add check for devm_kstrdup()
tee: optee: Fix supplicant wait loop
platform: cznic: CZNIC_PLATFORMS should depend on ARCH_MVEBU
firmware: imx: IMX_SCMI_MISC_DRV should depend on ARCH_MXC
MAINTAINERS: arm: apple: Add Janne as maintainer
MAINTAINERS: Mark Andrew as M: for ASPEED MACHINE SUPPORT
firmware: arm_scmi: imx: Correct tx size of scmi_imx_misc_ctrl_set
arm64: dts: rockchip: adjust SMMU interrupt type on rk3588
arm64: dts: rockchip: disable IOMMU when running rk3588 in PCIe endpoint mode
dt-bindings: rockchip: pmu: Ensure all properties are defined
arm64: defconfig: Enable TISCI Interrupt Router and Aggregator
arm64: dts: rockchip: Fix lcdpwr_en pin for Cool Pi GenBook
arm64: dts: rockchip: fix fixed-regulator renames on rk3399-gru devices
arm64: dts: rockchip: Disable DMA for uart5 on px30-ringneck
arm64: dts: rockchip: Move uart5 pin configuration to px30 ringneck SoM
arm64: dts: rockchip: change eth phy mode to rgmii-id for orangepi r1 plus lts
arm64: dts: rockchip: Fix broken tsadc pinctrl names for rk3588
|
|
Pull drm fixes from Dave Airlie:
"Weekly drm fixes pull request, lots of small things all over, msm has
a bunch of things but all very small, xe, i915, a fix for the cgroup
dmem controller.
core:
- remove MAINTAINERS entry
cgroup/dmem:
- use correct function for pool descendants
panel:
- fix signal polarity issue jd9365da-h3
nouveau:
- folio handling fix
- config fix
amdxdna:
- fix missing header
xe:
- Fix error handling in xe_irq_install
- Fix devcoredump format
i915:
- Use spin_lock_irqsave() in interruptible context on guc submission
- Fixes on DDI and TRANS programming
- Make sure all planes in use by the joiner have their crtc included
- Fix 128b/132b modeset issues
msm:
- More catalog fixes:
- to skip watchdog programming through top block if its not
present
- fix the setting of WB mask to ensure the WB input control is
programmed correctly through ping-pong
- drop lm_pair for sm6150 as that chipset does not have any
3dmerge block
- Fix the mode validation logic for DP/eDP to account for widebus
(2ppc) to allow high clock resolutions
- Fix to disable dither during encoder disable as otherwise this
was causing kms_writeback failure due to resource sharing
between WB and DSI paths as DSI uses dither but WB does not
- Fixes for virtual planes, namely to drop extraneous return and
fix uninitialized variables
- Fix to avoid spill-over of DSC encoder block bits when
programming the bits-per-component
- Fixes in the DSI PHY to protect against concurrent access of
PHY_CMN_CLK_CFG regs between clock and display drivers
- Core/GPU:
- Fix non-blocking fence wait incorrectly rounding up to 1 jiffy
timeout
- Only print GMU fw version once, instead of each time the GPU
resumes"
* tag 'drm-fixes-2025-02-22' of https://gitlab.freedesktop.org/drm/kernel: (28 commits)
drm/i915/dp: Fix disabling the transcoder function in 128b/132b mode
drm/i915/dp: Fix error handling during 128b/132b link training
accel/amdxdna: Add missing include linux/slab.h
MAINTAINERS: Remove myself
drm/nouveau/pmu: Fix gp10b firmware guard
cgroup/dmem: Don't open-code css_for_each_descendant_pre
drm/xe/guc: Fix size_t print format
drm/xe: Make GUC binaries dump consistent with other binaries in devcoredump
drm/i915: Make sure all planes in use by the joiner have their crtc included
drm/i915/ddi: Fix HDMI port width programming in DDI_BUF_CTL
drm/i915/dsi: Use TRANS_DDI_FUNC_CTL's own port width macro
drm/xe: Fix error handling in xe_irq_install()
drm/i915/gt: Use spin_lock_irqsave() in interruptible context
drm/msm/dsi/phy: Do not overwite PHY_CMN_CLK_CFG1 when choosing bitclk source
drm/msm/dsi/phy: Protect PHY_CMN_CLK_CFG1 against clock driver
drm/msm/dsi/phy: Protect PHY_CMN_CLK_CFG0 updated from driver side
drm/msm/dpu: Drop extraneous return in dpu_crtc_reassign_planes()
drm/msm/dpu: Don't leak bits_per_component into random DSC_ENC fields
drm/msm/dpu: Disable dither in phys encoder cleanup
drm/msm/dpu: Fix uninitialized variable
...
|
|
Pull block fixes from Jens Axboe:
- NVMe pull request via Keith:
- FC controller state check fixes (Daniel)
- PCI Endpoint fixes (Damien)
- TCP connection failure fixe (Caleb)
- TCP handling C2HTermReq PDU (Maurizio)
- RDMA queue state check (Ruozhu)
- Apple controller fixes (Hector)
- Target crash on disbaled namespace (Hannes)
- MD pull request via Yu:
- Fix queue limits error handling for raid0, raid1 and raid10
- Fix for a NULL pointer deref in request data mapping
- Code cleanup for request merging
* tag 'block-6.14-20250221' of git://git.kernel.dk/linux:
nvme: only allow entering LIVE from CONNECTING state
nvme-fc: rely on state transitions to handle connectivity loss
apple-nvme: Support coprocessors left idle
apple-nvme: Release power domains when probe fails
nvmet: Use enum definitions instead of hardcoded values
nvme: Cleanup the definition of the controller config register fields
nvme/ioctl: add missing space in err message
nvme-tcp: fix connect failure on receiving partial ICResp PDU
nvme: tcp: Fix compilation warning with W=1
nvmet: pci-epf: Avoid RCU stalls under heavy workload
nvmet: pci-epf: Do not uselessly write the CSTS register
nvmet: pci-epf: Correctly initialize CSTS when enabling the controller
nvmet-rdma: recheck queue state is LIVE in state lock in recv done
nvmet: Fix crash when a namespace is disabled
nvme-tcp: add basic support for the C2HTermReq PDU
nvme-pci: quirk Acer FA100 for non-uniqueue identifiers
block: fix NULL pointer dereferenced within __blk_rq_map_sg
block/merge: remove unnecessary min() with UINT_MAX
md/raid*: Fix the set_queue_limits implementations
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Fix a memory leak in the ACPI platform_profile driver (Kurt Borja)"
* tag 'acpi-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: platform_profile: Fix memory leak in profile_class_is_visible()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull mtd fixes from Miquel Raynal:
"The two most important fixes in this list are probably the SST write
failure and the Qcom raw NAND controller probe failure which are due
to some refactoring, otherwise there has been a series of misc fixes
on the Cadence raw NAND controller driver and especially on the DMA
side"
* tag 'mtd/fixes-for-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: rawnand: cadence: fix unchecked dereference
mtd: spi-nor: sst: Fix SST write failure
dt-bindings: mtd: cadence: document required clock-names
mtd: rawnand: qcom: fix broken config in qcom_param_page_type_exec
mtd: rawnand: cadence: fix incorrect device in dma_unmap_single
mtd: rawnand: cadence: use dma_map_resource for sdma address
mtd: rawnand: cadence: fix error code in cadence_nand_init()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
"There are two fixes for GPIO core: one adds missing retval checks to
older code, while the second adds SRCU synchronization to legs in code
that were missed during the big rework a few cycles back. There's also
one small driver fix:
- check the return value of the get_direction() callback in struct
gpio_chip
- protect the multi-line get/set legs in GPIO core with SRCU
- fix a race condition in gpio-vf610"
* tag 'gpio-fixes-for-v6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpiolib: don't bail out if get_direction() fails in gpiochip_add_data()
gpiolib: protect gpio_chip with SRCU in array_info paths in multi get/set
gpio: vf610: add locking to gpio direction functions
gpiolib: check the return value of gpio_chip::get_direction()
|
|
On X1E80100, there is a hardware bug in the register logic of the
IRQ_ENABLE_BANK register: While read accesses work on the normal address,
all write accesses must be made to a shifted address. Without a workaround
for this, the wrong interrupt gets enabled in the PDC and it is impossible
to wakeup from deep suspend (CX collapse). This has not caused problems so
far, because the deep suspend state was not enabled. A workaround is
required now since work is ongoing to fix this.
The PDC has multiple "DRV" regions, each one has a size of 0x10000 and
provides the same set of registers for a particular client in the system.
Linux is one the clients and uses DRV region 2 on X1E. Each "bank" inside
the DRV region consists of 32 interrupt pins that can be enabled using the
IRQ_ENABLE_BANK register:
IRQ_ENABLE_BANK[bank] = base + IRQ_ENABLE_BANK + bank * sizeof(u32)
On X1E, this works as intended for read access. However, write access to
most banks is shifted by 2:
IRQ_ENABLE_BANK_X1E[0] = IRQ_ENABLE_BANK[-2]
IRQ_ENABLE_BANK_X1E[1] = IRQ_ENABLE_BANK[-1]
IRQ_ENABLE_BANK_X1E[2] = IRQ_ENABLE_BANK[0] = IRQ_ENABLE_BANK[2 - 2]
IRQ_ENABLE_BANK_X1E[3] = IRQ_ENABLE_BANK[1] = IRQ_ENABLE_BANK[3 - 2]
IRQ_ENABLE_BANK_X1E[4] = IRQ_ENABLE_BANK[2] = IRQ_ENABLE_BANK[4 - 2]
IRQ_ENABLE_BANK_X1E[5] = IRQ_ENABLE_BANK[5] (this one works as intended)
The negative indexes underflow to banks of the previous DRV/client region:
IRQ_ENABLE_BANK_X1E[drv 2][bank 0] = IRQ_ENABLE_BANK[drv 2][bank -2]
= IRQ_ENABLE_BANK[drv 1][bank 5-2]
= IRQ_ENABLE_BANK[drv 1][bank 3]
= IRQ_ENABLE_BANK[drv 1][bank 0 + 3]
IRQ_ENABLE_BANK_X1E[drv 2][bank 1] = IRQ_ENABLE_BANK[drv 2][bank -1]
= IRQ_ENABLE_BANK[drv 1][bank 5-1]
= IRQ_ENABLE_BANK[drv 1][bank 4]
= IRQ_ENABLE_BANK[drv 1][bank 1 + 3]
Introduce a workaround for the bug by matching the qcom,x1e80100-pdc
compatible and apply the offsets as shown above:
- Bank 0...1: previous DRV region, bank += 3
- Bank 1...4: our DRV region, bank -= 2
- Bank 5: our DRV region, no fixup required
The PDC node in the device tree only describes the DRV region for the Linux
client, but the workaround also requires to map parts of the previous DRV
region to issue writes there. To maintain compatibility with old device
trees, obtain the base address of the preceeding region by applying the
-0x10000 offset. Note that this is also more correct from a conceptual
point of view:
It does not really make use of the other region; it just issues shifted
writes that end up in the registers of the Linux associated DRV region 2.
Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/all/20250218-x1e80100-pdc-hw-wa-v2-1-29be4c98e355@linaro.org
|
|
If the device doesn't support arpmb we'll crash due to copying user data in
bsg_transport_sg_io_fn().
In the case where ufs_bsg_exec_advanced_rpmb_req() returns an error, do not
set the job's reply_len.
Memory crash backtrace:
3,1290,531166405,-;ufshcd 0000:00:12.5: ARPMB OP failed: error code -22
4,1308,531166555,-;Call Trace:
4,1309,531166559,-; <TASK>
4,1310,531166565,-; ? show_regs+0x6d/0x80
4,1311,531166575,-; ? die+0x37/0xa0
4,1312,531166583,-; ? do_trap+0xd4/0xf0
4,1313,531166593,-; ? do_error_trap+0x71/0xb0
4,1314,531166601,-; ? usercopy_abort+0x6c/0x80
4,1315,531166610,-; ? exc_invalid_op+0x52/0x80
4,1316,531166622,-; ? usercopy_abort+0x6c/0x80
4,1317,531166630,-; ? asm_exc_invalid_op+0x1b/0x20
4,1318,531166643,-; ? usercopy_abort+0x6c/0x80
4,1319,531166652,-; __check_heap_object+0xe3/0x120
4,1320,531166661,-; check_heap_object+0x185/0x1d0
4,1321,531166670,-; __check_object_size.part.0+0x72/0x150
4,1322,531166679,-; __check_object_size+0x23/0x30
4,1323,531166688,-; bsg_transport_sg_io_fn+0x314/0x3b0
Fixes: 6ff265fc5ef6 ("scsi: ufs: core: bsg: Add advanced RPMB support in ufs_bsg")
Cc: stable@vger.kernel.org
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Arthur Simchaev <arthur.simchaev@sandisk.com>
Link: https://lore.kernel.org/r/20250220142039.250992-1-arthur.simchaev@sandisk.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Commit bb9850704c04 ("scsi: ufs: core: Honor runtime/system PM levels if
set by host controller drivers") introduced the check for setting default
PM levels only if the levels are uninitialized by the host controller
drivers. But it missed the fact that the levels could be initialized to 0
(UFS_PM_LVL_0) on purpose by the controller drivers. Even though none of
the drivers are doing so now, the logic should be fixed irrespectively.
So set the default levels unconditionally before calling ufshcd_hba_init()
API which initializes the controller drivers. It ensures that the
controller drivers could override the default levels if required.
Fixes: bb9850704c04 ("scsi: ufs: core: Honor runtime/system PM levels if set by host controller drivers")
Reported-by: Bao D. Nguyen <quic_nguyenb@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20250219105047.49932-1-manivannan.sadhasivam@linaro.org
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
After commit 1bad6c4a57ef ("scsi: zero per-cmd private driver data for each
MQ I/O"), the xen-scsifront/virtio_scsi/snic drivers all removed code that
explicitly zeroed driver-private command data.
In combination with commit 464a00c9e0ad ("scsi: core: Kill DRIVER_SENSE"),
after virtio_scsi performs a capacity expansion, the first request will
return a unit attention to indicate that the capacity has changed. And then
the original command is retried. As driver-private command data was not
cleared, the request would return UA again and eventually time out and fail.
Zero driver-private command data when a request is retried.
Fixes: f7de50da1479 ("scsi: xen-scsifront: Remove code that zeroes driver-private command data")
Fixes: c2bb87318baa ("scsi: virtio_scsi: Remove code that zeroes driver-private command data")
Fixes: c3006a926468 ("scsi: snic: Remove code that zeroes driver-private command data")
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20250217021628.2929248-1-yebin@huaweicloud.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply fixes from Sebastian Reichel:
- core: Fix extension related lockdep warning for LED triggers
- axp20x-battery: Fix fault handling for AXP717
- da9150-fg: fix potential overflow
* tag 'for-v6.14-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
power: supply: axp20x_battery: Fix fault handling for AXP717
power: supply: core: Fix extension related lockdep warning
power: supply: da9150-fg: fix potential overflow
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fix from Niklas Cassel:
- Fix an unintentional masking of AHCI ports when the device tree does
not define port child nodes (Damien)
* tag 'ata-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: libahci_platform: Do not set mask_port_map when not needed
|
|
Loongson's DWMAC device may take nearly two seconds to complete DMA reset,
however, the default waiting time for reset is 200 milliseconds.
Therefore, the following error message may appear:
[14.427169] dwmac-loongson-pci 0000:00:03.2: Failed to reset the dma
Fixes: 803fc61df261 ("net: stmmac: dwmac-loongson: Add Loongson Multi-channels GMAC support")
Cc: stable@vger.kernel.org
Signed-off-by: Qunqin Zhao <zhaoqunqin@loongson.cn>
Reviewed-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Yanteng Si <si.yanteng@linux.dev>
Link: https://patch.msgid.link/20250219020701.15139-1-zhaoqunqin@loongson.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://gitlab.freedesktop.org/drm/msm into drm-fixes
Fixes for v6.14-rc4
Display:
* More catalog fixes:
- to skip watchdog programming through top block if its not present
- fix the setting of WB mask to ensure the WB input control is programmed
correctly through ping-pong
- drop lm_pair for sm6150 as that chipset does not have any 3dmerge block
* Fix the mode validation logic for DP/eDP to account for widebus (2ppc)
to allow high clock resolutions
* Fix to disable dither during encoder disable as otherwise this was
causing kms_writeback failure due to resource sharing between
* WB and DSI paths as DSI uses dither but WB does not
* Fixes for virtual planes, namely to drop extraneous return and fix
uninitialized variables
* Fix to avoid spill-over of DSC encoder block bits when programming
the bits-per-component
* Fixes in the DSI PHY to protect against concurrent access of
PHY_CMN_CLK_CFG regs between clock and display drivers
Core/GPU:
* Fix non-blocking fence wait incorrectly rounding up to 1 jiffy timeout
* Only print GMU fw version once, instead of each time the GPU resumes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGtt2AODBXdod8ULXcAygf_qYvwRDVeUVtODx=2jErp6cA@mail.gmail.com
|
|
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
- Use spin_lock_irqsave() in interruptible context on guc submission (Krzysztof)
- Fixes on DDI and TRANS programming (Imre)
- Make sure all planes in use by the joiner have their crtc included (Ville)
- Fix 128b/132b modeset issues (Imre)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z7dgcUG_hvityvHn@intel.com
|
|
Pull NVMe fixes from Keith:
"nvme fixes for Linux 6.14
- FC controller state check fixes (Daniel)
- PCI Endpoint fixes (Damien)
- TCP connection failure fixe (Caleb)
- TCP handling C2HTermReq PDU (Maurizio)
- RDMA queue state check (Ruozhu)
- Apple controller fixes (Hector)
- Target crash on disbaled namespace (Hannes)"
* tag 'nvme-6.14-2025-02-20' of git://git.infradead.org/nvme:
nvme: only allow entering LIVE from CONNECTING state
nvme-fc: rely on state transitions to handle connectivity loss
apple-nvme: Support coprocessors left idle
apple-nvme: Release power domains when probe fails
nvmet: Use enum definitions instead of hardcoded values
nvme: Cleanup the definition of the controller config register fields
nvme/ioctl: add missing space in err message
nvme-tcp: fix connect failure on receiving partial ICResp PDU
nvme: tcp: Fix compilation warning with W=1
nvmet: pci-epf: Avoid RCU stalls under heavy workload
nvmet: pci-epf: Do not uselessly write the CSTS register
nvmet: pci-epf: Correctly initialize CSTS when enabling the controller
nvmet-rdma: recheck queue state is LIVE in state lock in recv done
nvmet: Fix crash when a namespace is disabled
nvme-tcp: add basic support for the C2HTermReq PDU
nvme-pci: quirk Acer FA100 for non-uniqueue identifiers
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
- Fix error handling in xe_irq_install (Lucas)
- Fix devcoredump format (Jose, Lucas)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Z7dePS3a9POnjrVL@intel.com
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
An reset signal polarity fix for the jd9365da-h3 panel, a folio handling
fix and config fix in nouveau, a dmem cgroup descendant pool handling
fix, and a missing header for amdxdna.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250220-glorious-cockle-of-might-5b35f7@houat
|
|
Add check for the return value of devm_kstrdup() in
loongson2_guts_probe() to catch potential exception.
Fixes: b82621ac8450 ("soc: loongson: add GUTS driver for loongson-2 platforms")
Cc: stable@vger.kernel.org
Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com>
Link: https://lore.kernel.org/r/20250220081714.2676828-1-haoxiang_li2024@163.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes
Arm SCMI fix for v6.14
Just a single fix to address the incorrect size of the Tx buffer in the
function scmi_imx_misc_ctrl_set() which is part of NXP/i.MX SCMI vendor
extensions.
* tag 'scmi-fix-6.14' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
firmware: arm_scmi: imx: Correct tx size of scmi_imx_misc_ctrl_set
Link: https://lore.kernel.org/r/20250217155246.1668182-1-sudeep.holla@arm.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
While setting the DAC value, the wrong boolean value is evaluated to set
the DSP bias current. So let's correct the conditional statement and use
the right boolean value read from the DTS set in the priv.
Cc: stable@vger.kernel.org
Fixes: d1cb613efbd3 ("net: phy: qcom: add support for QCA807x PHY Family")
Signed-off-by: George Moussalem <george.moussalem@outlook.com>
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250219130923.7216-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The SCO packets from Bluetooth raw socket are now rejected because
hci_conn_num is left 0. This patch allows such the usecase to enable
the userspace SCO support.
Fixes: b16b327edb4d ("Bluetooth: btusb: add sysfs attribute to control USB alt setting")
Signed-off-by: Hsin-chen Chuang <chharry@chromium.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Smaller than usual with no fixes from any subtree.
Current release - regressions:
- core: fix race of rtnl_net_lock(dev_net(dev))
Previous releases - regressions:
- core: remove the single page frag cache for good
- flow_dissector: fix handling of mixed port and port-range keys
- sched: cls_api: fix error handling causing NULL dereference
- tcp:
- adjust rcvq_space after updating scaling ratio
- drop secpath at the same time as we currently drop dst
- eth: gtp: suppress list corruption splat in gtp_net_exit_batch_rtnl().
Previous releases - always broken:
- vsock:
- fix variables initialization during resuming
- for connectible sockets allow only connected
- eth:
- geneve: fix use-after-free in geneve_find_dev()
- ibmvnic: don't reference skb after sending to VIOS"
* tag 'net-6.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits)
Revert "net: skb: introduce and use a single page frag cache"
net: allow small head cache usage with large MAX_SKB_FRAGS values
nfp: bpf: Add check for nfp_app_ctrl_msg_alloc()
tcp: drop secpath at the same time as we currently drop dst
net: axienet: Set mac_managed_pm
arp: switch to dev_getbyhwaddr() in arp_req_set_public()
net: Add non-RCU dev_getbyhwaddr() helper
sctp: Fix undefined behavior in left shift operation
selftests/bpf: Add a specific dst port matching
flow_dissector: Fix port range key handling in BPF conversion
selftests/net/forwarding: Add a test case for tc-flower of mixed port and port-range
flow_dissector: Fix handling of mixed port and port-range keys
geneve: Suppress list corruption splat in geneve_destroy_tunnels().
gtp: Suppress list corruption splat in gtp_net_exit_batch_rtnl().
dev: Use rtnl_net_dev_lock() in unregister_netdev().
net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net().
net: Add net_passive_inc() and net_passive_dec().
net: pse-pd: pd692x0: Fix power limit retrieval
MAINTAINERS: trim the GVE entry
gve: set xdp redirect target only when it is available
...
|
|
The fabric transports and also the PCI transport are not entering the
LIVE state from NEW or RESETTING. This makes the state machine more
restrictive and allows to catch not supported state transitions, e.g.
directly switching from RESETTING to LIVE.
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
It's not possible to call nvme_state_ctrl_state with holding a spin
lock, because nvme_state_ctrl_state calls cancel_delayed_work_sync
when fastfail is enabled.
Instead syncing the ASSOC_FLAG and state transitions using a lock, it's
possible to only rely on the state machine transitions. That means
nvme_fc_ctrl_connectivity_loss should unconditionally call
nvme_reset_ctrl which avoids the read race on the ctrl state variable.
Actually, it's not necessary to test in which state the ctrl is, the
reset work will only scheduled when the state machine is in LIVE state.
In nvme_fc_create_association, the LIVE state can only be entered if it
was previously CONNECTING. If this is not possible then the reset
handler got triggered. Thus just error out here.
Fixes: ee59e3820ca9 ("nvme-fc: do not ignore connectivity loss during connecting")
Closes: https://lore.kernel.org/all/denqwui6sl5erqmz2gvrwueyxakl5txzbbiu3fgebryzrfxunm@iwxuthct377m/
Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Signed-off-by: Keith Busch <kbusch@kernel.org>
|
|
Since commit 9d846b1aebbe ("gpiolib: check the return value of
gpio_chip::get_direction()") we check the return value of the
get_direction() callback as per its API contract. Some drivers have been
observed to fail to register now as they may call get_direction() in
gpiochip_add_data() in contexts where it has always silently failed.
Until we audit all drivers, replace the bail-out to a kernel log
warning.
Fixes: 9d846b1aebbe ("gpiolib: check the return value of gpio_chip::get_direction()")
Reported-by: Mark Brown <broonie@kernel.org>
Closes: https://lore.kernel.org/all/Z7VFB1nST6lbmBIo@finisterre.sirena.org.uk/
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Closes: https://lore.kernel.org/all/dfe03f88-407e-4ef1-ad30-42db53bbd4e4@samsung.com/
Tested-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20250219144356.258635-1-brgl@bgdev.pl
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
Add check for the return value of nfp_app_ctrl_msg_alloc() in
nfp_bpf_cmsg_alloc() to prevent null pointer dereference.
Fixes: ff3d43f7568c ("nfp: bpf: implement helpers for FW map ops")
Cc: stable@vger.kernel.org
Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com>
Link: https://patch.msgid.link/20250218030409.2425798-1-haoxiang_li2024@163.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The external PHY will undergo a soft reset twice during the resume process
when it wake up from suspend. The first reset occurs when the axienet
driver calls phylink_of_phy_connect(), and the second occurs when
mdio_bus_phy_resume() invokes phy_init_hw(). The second soft reset of the
external PHY does not reinitialize the internal PHY, which causes issues
with the internal PHY, resulting in the PHY link being down. To prevent
this, setting the mac_managed_pm flag skips the mdio_bus_phy_resume()
function.
Fixes: a129b41fe0a8 ("Revert "net: phy: dp83867: perform soft reset and retain established link"")
Signed-off-by: Nick Hu <nick.hu@sifive.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250217055843.19799-1-nick.hu@sifive.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Previously static rate wasn't translated according to our PRM but simply
used the 4 lower bytes.
Correctly translate static rate value passed in AH creation attribute
according to our PRM expected values.
In addition change 800GB mapping to zero, which is the PRM
specified value.
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Link: https://patch.msgid.link/18ef4cc5396caf80728341eb74738cd777596f60.1739187089.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Fix the destroy_unused_implicit_child_mr() to prevent hanging during
parent deregistration as of below [1].
Upon entering destroy_unused_implicit_child_mr(), the reference count
for the implicit MR parent is incremented using:
refcount_inc_not_zero().
A corresponding decrement must be performed if
free_implicit_child_mr_work() is not called.
The code has been updated to properly manage the reference count that
was incremented.
[1]
INFO: task python3:2157 blocked for more than 120 seconds.
Not tainted 6.12.0-rc7+ #1633
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:python3 state:D stack:0 pid:2157 tgid:2157 ppid:1685 flags:0x00000000
Call Trace:
<TASK>
__schedule+0x420/0xd30
schedule+0x47/0x130
__mlx5_ib_dereg_mr+0x379/0x5d0 [mlx5_ib]
? __pfx_autoremove_wake_function+0x10/0x10
ib_dereg_mr_user+0x5f/0x120 [ib_core]
? lock_release+0xc6/0x280
destroy_hw_idr_uobject+0x1d/0x60 [ib_uverbs]
uverbs_destroy_uobject+0x58/0x1d0 [ib_uverbs]
uobj_destroy+0x3f/0x70 [ib_uverbs]
ib_uverbs_cmd_verbs+0x3e4/0xbb0 [ib_uverbs]
? __pfx_uverbs_destroy_def_handler+0x10/0x10 [ib_uverbs]
? lock_acquire+0xc1/0x2f0
? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs]
? ib_uverbs_ioctl+0x116/0x170 [ib_uverbs]
? lock_release+0xc6/0x280
ib_uverbs_ioctl+0xe7/0x170 [ib_uverbs]
? ib_uverbs_ioctl+0xcb/0x170 [ib_uverbs]
__x64_sys_ioctl+0x1b0/0xa70
? kmem_cache_free+0x221/0x400
do_syscall_64+0x6b/0x140
entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f20f21f017b
RSP: 002b:00007ffcfc4a77c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffcfc4a78d8 RCX: 00007f20f21f017b
RDX: 00007ffcfc4a78c0 RSI: 00000000c0181b01 RDI: 0000000000000003
RBP: 00007ffcfc4a78a0 R08: 000056147d125190 R09: 00007f20f1f14c60
R10: 0000000000000001 R11: 0000000000000246 R12: 00007ffcfc4a7890
R13: 000000000000001c R14: 000056147d100fc0 R15: 00007f20e365c9d0
</TASK>
Fixes: d3d930411ce3 ("RDMA/mlx5: Fix implicit ODP use after free")
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Reviewed-by: Artemy Kovalyov <artemyko@nvidia.com>
Link: https://patch.msgid.link/80f2fcd19952dfa7d9981d93fd6359b4471f8278.1739186929.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
As explained in the previous patch, iterating for_each_netdev() and
gn->geneve_list during ->exit_batch_rtnl() could trigger ->dellink()
twice for the same device.
If CONFIG_DEBUG_LIST is enabled, we will see a list_del() corruption
splat in the 2nd call of geneve_dellink().
Let's remove for_each_netdev() in geneve_destroy_tunnels() and delegate
that part to default_device_exit_batch().
Fixes: 9593172d93b9 ("geneve: Fix use-after-free in geneve_find_dev().")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250217203705.40342-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Brad Spengler reported the list_del() corruption splat in
gtp_net_exit_batch_rtnl(). [0]
Commit eb28fd76c0a0 ("gtp: Destroy device along with udp socket's netns
dismantle.") added the for_each_netdev() loop in gtp_net_exit_batch_rtnl()
to destroy devices in each netns as done in geneve and ip tunnels.
However, this could trigger ->dellink() twice for the same device during
->exit_batch_rtnl().
Say we have two netns A & B and gtp device B that resides in netns B but
whose UDP socket is in netns A.
1. cleanup_net() processes netns A and then B.
2. gtp_net_exit_batch_rtnl() finds the device B while iterating
netns A's gn->gtp_dev_list and calls ->dellink().
[ device B is not yet unlinked from netns B
as unregister_netdevice_many() has not been called. ]
3. gtp_net_exit_batch_rtnl() finds the device B while iterating
netns B's for_each_netdev() and calls ->dellink().
gtp_dellink() cleans up the device's hash table, unlinks the dev from
gn->gtp_dev_list, and calls unregister_netdevice_queue().
Basically, calling gtp_dellink() multiple times is fine unless
CONFIG_DEBUG_LIST is enabled.
Let's remove for_each_netdev() in gtp_net_exit_batch_rtnl() and
delegate the destruction to default_device_exit_batch() as done
in bareudp.
[0]:
list_del corruption, ffff8880aaa62c00->next (autoslab_size_M_dev_P_net_core_dev_11127_8_1328_8_S_4096_A_64_n_139+0xc00/0x1000 [slab object]) is LIST_POISON1 (ffffffffffffff02) (prev is 0xffffffffffffff04)
kernel BUG at lib/list_debug.c:58!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 UID: 0 PID: 1804 Comm: kworker/u8:7 Tainted: G T 6.12.13-grsec-full-20250211091339 #1
Tainted: [T]=RANDSTRUCT
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Workqueue: netns cleanup_net
RIP: 0010:[<ffffffff84947381>] __list_del_entry_valid_or_report+0x141/0x200 lib/list_debug.c:58
Code: c2 76 91 31 c0 e8 9f b1 f7 fc 0f 0b 4d 89 f0 48 c7 c1 02 ff ff ff 48 89 ea 48 89 ee 48 c7 c7 e0 c2 76 91 31 c0 e8 7f b1 f7 fc <0f> 0b 4d 89 e8 48 c7 c1 04 ff ff ff 48 89 ea 48 89 ee 48 c7 c7 60
RSP: 0018:fffffe8040b4fbd0 EFLAGS: 00010283
RAX: 00000000000000cc RBX: dffffc0000000000 RCX: ffffffff818c4054
RDX: ffffffff84947381 RSI: ffffffff818d1512 RDI: 0000000000000000
RBP: ffff8880aaa62c00 R08: 0000000000000001 R09: fffffbd008169f32
R10: fffffe8040b4f997 R11: 0000000000000001 R12: a1988d84f24943e4
R13: ffffffffffffff02 R14: ffffffffffffff04 R15: ffff8880aaa62c08
RBX: kasan shadow of 0x0
RCX: __wake_up_klogd.part.0+0x74/0xe0 kernel/printk/printk.c:4554
RDX: __list_del_entry_valid_or_report+0x141/0x200 lib/list_debug.c:58
RSI: vprintk+0x72/0x100 kernel/printk/printk_safe.c:71
RBP: autoslab_size_M_dev_P_net_core_dev_11127_8_1328_8_S_4096_A_64_n_139+0xc00/0x1000 [slab object]
RSP: process kstack fffffe8040b4fbd0+0x7bd0/0x8000 [kworker/u8:7+netns 1804 ]
R09: kasan shadow of process kstack fffffe8040b4f990+0x7990/0x8000 [kworker/u8:7+netns 1804 ]
R10: process kstack fffffe8040b4f997+0x7997/0x8000 [kworker/u8:7+netns 1804 ]
R15: autoslab_size_M_dev_P_net_core_dev_11127_8_1328_8_S_4096_A_64_n_139+0xc08/0x1000 [slab object]
FS: 0000000000000000(0000) GS:ffff888116000000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000748f5372c000 CR3: 0000000015408000 CR4: 00000000003406f0 shadow CR4: 00000000003406f0
Stack:
0000000000000000 ffffffff8a0c35e7 ffffffff8a0c3603 ffff8880aaa62c00
ffff8880aaa62c00 0000000000000004 ffff88811145311c 0000000000000005
0000000000000001 ffff8880aaa62000 fffffe8040b4fd40 ffffffff8a0c360d
Call Trace:
<TASK>
[<ffffffff8a0c360d>] __list_del_entry_valid include/linux/list.h:131 [inline] fffffe8040b4fc28
[<ffffffff8a0c360d>] __list_del_entry include/linux/list.h:248 [inline] fffffe8040b4fc28
[<ffffffff8a0c360d>] list_del include/linux/list.h:262 [inline] fffffe8040b4fc28
[<ffffffff8a0c360d>] gtp_dellink+0x16d/0x360 drivers/net/gtp.c:1557 fffffe8040b4fc28
[<ffffffff8a0d0404>] gtp_net_exit_batch_rtnl+0x124/0x2c0 drivers/net/gtp.c:2495 fffffe8040b4fc88
[<ffffffff8e705b24>] cleanup_net+0x5a4/0xbe0 net/core/net_namespace.c:635 fffffe8040b4fcd0
[<ffffffff81754c97>] process_one_work+0xbd7/0x2160 kernel/workqueue.c:3326 fffffe8040b4fd88
[<ffffffff81757195>] process_scheduled_works kernel/workqueue.c:3407 [inline] fffffe8040b4fec0
[<ffffffff81757195>] worker_thread+0x6b5/0xfa0 kernel/workqueue.c:3488 fffffe8040b4fec0
[<ffffffff817782a0>] kthread+0x360/0x4c0 kernel/kthread.c:397 fffffe8040b4ff78
[<ffffffff814d8594>] ret_from_fork+0x74/0xe0 arch/x86/kernel/process.c:172 fffffe8040b4ffb8
[<ffffffff8110f509>] ret_from_fork_asm+0x29/0xc0 arch/x86/entry/entry_64.S:399 fffffe8040b4ffe8
</TASK>
Modules linked in:
Fixes: eb28fd76c0a0 ("gtp: Destroy device along with udp socket's netns dismantle.")
Reported-by: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250217203705.40342-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
During disabling the transcoder in DP 128b/132b mode (both in case of an
MST master transcoder and in case of SST) the transcoder function must
be first disabled without changing any other field in the register (in
particular leaving the DDI port and mode select fields unchanged) and
clearing the DDI port and mode select fields separately, later during
the disabling sequences. Fix the sequence accordingly.
Bspec: 54128, 65448, 68849
Cc: Jani Nikula <jani.nikula@intel.com>
Fixes: 79a6734cd56e ("drm/i915/ddi: disable trancoder port select for 128b/132b SST")
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250217223828.1166093-3-imre.deak@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 2ed653c7b843db0670136330480842d76cb65cd8)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
At the end of a 128b/132b link training sequence, the HW expects the
transcoder training pattern to be set to TPS2 and from that to normal
mode (disabling the training pattern). Transitioning from TPS1 directly
to normal mode leaves the transcoder in a stuck state, resulting in
page-flip timeouts later in the modeset sequence.
Atm, in case of a failure during link training, the transcoder may be
still set to output the TPS1 pattern. Later the transcoder is then set
from TPS1 directly to normal mode in intel_dp_stop_link_train(), leading
to modeset failures later as described above. Fix this by setting the
training patter to TPS2, if the link training failed at any point.
The clue in the specification about the above HW behavior is the
explicit mention that TPS2 must be set after the link training sequence
(and there isn't a similar requirement specified for the 8b/10b link
training), see the Bspec links below.
v2: Add bspec aspect/link to the commit log. (Jani)
Bspec: 54128, 65448, 68849
Cc: stable@vger.kernel.org # v5.18+
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250217223828.1166093-2-imre.deak@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 8b4bbaf8ddc1f68f3ee96a706f65fdb1bcd9d355)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
When compiling without CONFIG_IA32_EMULATION, there can be some errors:
drivers/accel/amdxdna/amdxdna_mailbox.c: In function ‘mailbox_release_msg’:
drivers/accel/amdxdna/amdxdna_mailbox.c:197:2: error: implicit declaration
of function ‘kfree’.
197 | kfree(mb_msg);
| ^~~~~
drivers/accel/amdxdna/amdxdna_mailbox.c: In function ‘xdna_mailbox_send_msg’:
drivers/accel/amdxdna/amdxdna_mailbox.c:418:11: error:implicit declaration
of function ‘kzalloc’.
418 | mb_msg = kzalloc(sizeof(*mb_msg) + pkg_size, GFP_KERNEL);
| ^~~~~~~
Add the missing include.
Fixes: b87f920b9344 ("accel/amdxdna: Support hardware mailbox")
Signed-off-by: Su Hui <suhui@nfschina.com>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250211015354.3388171-1-suhui@nfschina.com
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux into mtd/fixes
Fix writes on SST flashes
Commit 18bcb4aa54ea ("mtd: spi-nor: sst: Factor out common write
operation to `sst_nor_write_data()`") introduced a bug where only one
byte of data is written, regardless of the number of bytes requested.
This causes the driver to use the incorrect write size for flashes using
the SST byte programming, and to spit out a warning.
# -----BEGIN PGP SIGNATURE-----
#
# iIoEABYIADIWIQQTlUWNzXGEo3bFmyIR4drqP028CQUCZ7NEiBQccHJhdHl1c2hA
# a2VybmVsLm9yZwAKCRAR4drqP028CTVnAP9krBOLfmlYO94PntaDscgjcehnxbuF
# PEQby8/KlEnX0gEA5K73/0oQIZUnHQ98E6ntAtKoYD5zGNAJaYDpw+66CAU=
# =5xea
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 17 Feb 2025 03:15:36 PM CET
# gpg: using EDDSA key 1395458DCD7184A376C59B2211E1DAEA3F4DBC09
# gpg: issuer "pratyush@kernel.org"
# gpg: Good signature from "Pratyush Yadav <p.yadav@ti.com>" [expired]
# gpg: aka "Pratyush Yadav <me@yadavpratyush.com>" [expired]
# gpg: issuer "pratyush@kernel.org" does not match any User ID
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 805C 3923 2FBE 108C 49E1 663C F650 3556 C11B 1CCD
# Subkey fingerprint: 1395 458D CD71 84A3 76C5 9B22 11E1 DAEA 3F4D BC09
|