Age | Commit message (Collapse) | Author |
|
agp_bridge_data::type is unused (and I cannot find when was used last).
Therefore, remove it.
Found by https://github.com/jirislaby/clang-struct.
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Cc: David Airlie <airlied@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20240213111511.25187-1-jirislaby@kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.8, take #2
- Avoid dropping the page refcount twice when freeing an unlinked
page-table subtree.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.8, take #1
- Don't source the VFIO Kconfig twice
- Fix protected-mode locking order between kvm and vcpus
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of device-specific fixes. It became a bit bigger than
wished, but all look reasonably small and safe to apply.
- A few Cirrus Logic CS35L56 and CS42L43 driver fixes
- ASoC SOF fixes and workarounds
- Various ASoC Intel fixes
- Lots of HD-, USB-audio and AMD ACP quirks"
* tag 'sound-6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (33 commits)
ALSA: usb-audio: More relaxed check of MIDI jack names
ALSA: hda/realtek: fix mute/micmute LED For HP mt645
ALSA: hda/realtek: cs35l41: Fix order and duplicates in quirks table
ALSA: hda/realtek: cs35l41: Fix device ID / model name
ALSA: hda/realtek: cs35l41: Add internal speaker support for ASUS UM3402 with missing DSD
ASoC: cs35l56: Workaround for ACPI with broken spk-id-gpios property
ALSA: hda: Add Lenovo Legion 7i gen7 sound quirk
ASoC: SOF: IPC3: fix message bounds on ipc ops
ASoC: SOF: ipc4-pcm: Workaround for crashed firmware on system suspend
ASoC: q6dsp: fix event handler prototype
ASoC: SOF: Intel: pci-lnl: Change the topology path to intel/sof-ipc4-tplg
ASoC: SOF: Intel: pci-tgl: Change the default paths and firmware names
ASoC: amd: yc: Fix non-functional mic on Lenovo 82UU
ASoC: rt5645: Add DMI quirk for inverted jack-detect on MeeGoPad T8
ASoC: rt5645: Make LattePanda board DMI match more precise
ASoC: SOF: amd: Fix locking in ACP IRQ handler
ASoC: rt5645: Fix deadlock in rt5645_jack_detect_work()
ASoC: Intel: cht_bsw_rt5645: Cleanup codec_name handling
ASoC: Intel: Boards: Fix NULL pointer deref in BYT/CHT boards
ASoC: cs35l56: Remove default from IRQ1_CFG register
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- add missing stubs for functions that are not built with GPIOLIB
disabled
* tag 'gpio-fixes-for-v6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpiolib: add gpio_device_get_label() stub for !GPIOLIB
gpiolib: add gpio_device_get_base() stub for !GPIOLIB
gpiolib: add gpiod_to_gpio_device() stub for !GPIOLIB
|
|
Pull drm fixes from Dave Airlie:
"Regular weekly fixes, nothing too major, mostly amdgpu, then i915, xe,
msm and nouveau with some scattered bits elsewhere.
crtc:
- fix uninit variable
prime:
- support > 4GB page arrays
buddy:
- fix error handling in allocations
i915:
- fix blankscreen on JSL chromebooks
- stable fix to limit DP sst link rates
xe:
- Fix an out-of-bounds shift.
- Fix the display code thinking xe uses shmem
- Fix a warning about index out-of-bound
- Fix a clang-16 compilation warning
amdgpu:
- PSR fixes
- Suspend/resume fixes
- Link training fix
- Aspect ratio fix
- DCN 3.5 fixes
- VCN 4.x fix
- GFX 11 fix
- Misc display fixes
- Misc small fixes
amdkfd:
- Cache size reporting fix
- SIMD distribution fix
msm:
- GPU:
- dmabuf vmap fix
- a610 UBWC corruption fix (incorrect hbb)
- revert a commit that was making GPU recovery unreliable
- tlb invalidation fix
ivpu:
- suspend/resume fix
nouveau:
- fix scheduler cleanup path
- fix pointless scheduler creation
- fix kvalloc argument order
rockchip:
- vop2 locking fix"
* tag 'drm-fixes-2024-02-16' of git://anongit.freedesktop.org/drm/drm: (38 commits)
drm/amdgpu: Fix implicit assumtion in gfx11 debug flags
drm/amdkfd: update SIMD distribution algo for GFXIP 9.4.2 onwards
drm/amd/display: Increase ips2_eval delay for DCN35
drm/amdgpu/display: Initialize gamma correction mode variable in dcn30_get_gamcor_current()
drm/amdgpu/soc21: update VCN 4 max HEVC encoding resolution
drm/amd/display: fixed integer types and null check locations
drm/amd/display: Fix array-index-out-of-bounds in dcn35_clkmgr
drm/amd/display: Preserve original aspect ratio in create stream
drm/amd/display: Fix possible NULL dereference on device remove/driver unload
Revert "drm/amd/display: increased min_dcfclk_mhz and min_fclk_mhz"
drm/amd/display: Add align done check
Revert "drm/amd: flush any delayed gfxoff on suspend entry"
drm/amd: Stop evicting resources on APUs in suspend
drm/amd/display: Fix possible buffer overflow in 'find_dcfclk_for_voltage()'
drm/amd/display: Fix possible use of uninitialized 'max_chunks_fbc_mode' in 'calculate_bandwidth()'
drm/amd/display: Initialize 'wait_time_microsec' variable in link_dp_training_dpia.c
drm/amd/display: Fix && vs || typos
drm/amdkfd: Fix L2 cache size reporting in GFX9.4.3
drm/amdgpu: make damage clips support configurable
drm/msm: Wire up tlb ops
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull lsm fix from Paul Moore:
"One small LSM patch to fix a potential integer overflow in the newly
added lsm_set_self_attr() syscall"
* tag 'lsm-pr-20240215' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
lsm: fix integer overflow in lsm_set_self_attr() syscall
|
|
The bridge type was set to default (Unknown). Set proper bridge type.
With this fixed, debugfs output imx8mp looks proper.
$ cat /sys/kernel/debug/dri/1/encoder-0/bridges
bridge[0]: imx_hdmi_pvi_bridge_funcs [imx8mp_hdmi_pvi]
type: [0] Unknown
OF: /soc@0/bus@32c00000/display-bridge@32fc4000:fsl,imx8mp-hdmi-pvi
ops: [0x0]
bridge[1]: dw_hdmi_bridge_funcs [dw_hdmi]
type: [11] HDMI-A
OF: /soc@0/bus@32c00000/hdmi@32fd8000:fsl,imx8mp-hdmi
ops: [0x7] detect edid hpd
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20240108152514.533414-1-alexander.stein@ew.tq-group.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240108152514.533414-1-alexander.stein@ew.tq-group.com
|
|
Remove the duplicate list_splice_tail call when the
total_allocated < size condition is true.
Cc: <stable@vger.kernel.org> # 6.7+
Fixes: 8746c6c9dfa3 ("drm/buddy: Fix alloc_range() error handling code")
Reported-by: Bert Karwatzki <spasswolf@web.de>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240216100048.4101-1-Arunpravin.PaneerSelvam@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Switch to using the new gem shared memory stats helper
rather than hand rolling it.
Link: https://lore.kernel.org/all/20231207180225.439482-1-alexander.deucher@amd.com/
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Switch to using the new gem shared memory stats helper
rather than hand rolling it.
Link: https://lore.kernel.org/all/20231207180225.439482-1-alexander.deucher@amd.com/
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Add shared stats. Useful for seeing shared memory.
v2: take dma-buf into account as well
v3: use the new gem helper
Link: https://lore.kernel.org/all/20231207180225.439482-1-alexander.deucher@amd.com/
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Rob Clark <robdclark@gmail.com>
Reviewed-by: Christian König <christian.keonig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Show buffers as shared if they are shared via dma-buf as well
(e.g., shared with v4l or some other subsystem).
v2: switch to gem helper
Link: https://lore.kernel.org/all/20231207180225.439482-1-alexander.deucher@amd.com/
Reviewed-by: Rob Clark <robdclark@gmail.com> (v1)
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.keonig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Add a helper so that drm drivers can consistently report
shared status via the fdinfo shared memory stats interface.
In addition to handle count, show buffers as shared if they
are shared via dma-buf as well (e.g., shared with v4l or some
other subsystem).
v2: switch to inline function
Link: https://lore.kernel.org/all/20231207180225.439482-1-alexander.deucher@amd.com/
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.keonig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Clarify the documentation in preparation for updated
helpers which check the handle count as well as whether
a dma-buf has been attached.
Link: https://lore.kernel.org/all/20231207180225.439482-1-alexander.deucher@amd.com/
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.keonig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
|
|
Add support for the 2340x1080 LCD panel (DJN 9A-3R063-1102B) bundled
with a HX83112A driver IC, as found on the Fairphone 4 smartphone.
Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20240216-fp4-panel-v3-2-a556e4b79640@fairphone.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
|
|
Himax HX83112A is a display driver IC used to drive LCD DSI panels.
Describe it and the DJN 9A-3R063-1102B using it.
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Luca Weiss <luca.weiss@fairphone.com>
Link: https://lore.kernel.org/r/20240216-fp4-panel-v3-1-a556e4b79640@fairphone.com
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240216-fp4-panel-v3-1-a556e4b79640@fairphone.com
|
|
The variable space is being initialized with a value that is never read,
it is being re-assigned later on. The initialization is redundant and
can be removed. Also merge two declaration lines together.
Cleans up clang scan build warning:
drivers/gpu/host1x/cdma.c:628:15: warning: Value stored to 'space'
during its initialization is never read [deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240215224301.2073845-1-colin.i.king@gmail.com
|
|
If we're redirecting the skb, and haven't called tcf_mirred_forward(),
yet, we need to tell the core to drop the skb by setting the retcode
to SHOT. If we have called tcf_mirred_forward(), however, the skb
is out of our hands and returning SHOT will lead to UaF.
Move the retval override to the error path which actually need it.
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Fixes: e5cf1baf92cb ("act_mirred: use TC_ACT_REINSERT when possible")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The test Davide added in commit ca22da2fbd69 ("act_mirred: use the backlog
for nested calls to mirred ingress") hangs our testing VMs every 10 or so
runs, with the familiar tcp_v4_rcv -> tcp_v4_rcv deadlock reported by
lockdep.
The problem as previously described by Davide (see Link) is that
if we reverse flow of traffic with the redirect (egress -> ingress)
we may reach the same socket which generated the packet. And we may
still be holding its socket lock. The common solution to such deadlocks
is to put the packet in the Rx backlog, rather than run the Rx path
inline. Do that for all egress -> ingress reversals, not just once
we started to nest mirred calls.
In the past there was a concern that the backlog indirection will
lead to loss of error reporting / less accurate stats. But the current
workaround does not seem to address the issue.
Fixes: 53592b364001 ("net/sched: act_mirred: Implement ingress actions")
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Suggested-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/netdev/33dc43f587ec1388ba456b4915c75f02a8aae226.1663945716.git.dcaratti@redhat.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This driver uses functions that are supplied by the Kconfig symbol
PHYLIB, so select it to ensure that they are built as needed.
When CONFIG_ADIN1110=y and CONFIG_PHYLIB=m, there are multiple build
(linker) errors that are resolved by this Kconfig change:
ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_net_open':
drivers/net/ethernet/adi/adin1110.c:933: undefined reference to `phy_start'
ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_probe_netdevs':
drivers/net/ethernet/adi/adin1110.c:1603: undefined reference to `get_phy_device'
ld: drivers/net/ethernet/adi/adin1110.c:1609: undefined reference to `phy_connect'
ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_disconnect_phy':
drivers/net/ethernet/adi/adin1110.c:1226: undefined reference to `phy_disconnect'
ld: drivers/net/ethernet/adi/adin1110.o: in function `devm_mdiobus_alloc':
include/linux/phy.h:455: undefined reference to `devm_mdiobus_alloc_size'
ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_register_mdiobus':
drivers/net/ethernet/adi/adin1110.c:529: undefined reference to `__devm_mdiobus_register'
ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_net_stop':
drivers/net/ethernet/adi/adin1110.c:958: undefined reference to `phy_stop'
ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_disconnect_phy':
drivers/net/ethernet/adi/adin1110.c:1226: undefined reference to `phy_disconnect'
ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_adjust_link':
drivers/net/ethernet/adi/adin1110.c:1077: undefined reference to `phy_print_status'
ld: drivers/net/ethernet/adi/adin1110.o: in function `adin1110_ioctl':
drivers/net/ethernet/adi/adin1110.c:790: undefined reference to `phy_do_ioctl'
ld: drivers/net/ethernet/adi/adin1110.o:(.rodata+0xf60): undefined reference to `phy_ethtool_get_link_ksettings'
ld: drivers/net/ethernet/adi/adin1110.o:(.rodata+0xf68): undefined reference to `phy_ethtool_set_link_ksettings'
Fixes: bc93e19d088b ("net: ethernet: adi: Add ADIN1110 support")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202402070626.eZsfVHG5-lkp@intel.com/
Cc: Lennart Franzen <lennart@lfdomain.com>
Cc: Alexandru Tachici <alexandru.tachici@analog.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: netdev@vger.kernel.org
Reviewed-by: Nuno Sa <nuno.sa@analog.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
syzkaller reported a warning [0] in inet_csk_destroy_sock() with no
repro.
WARN_ON(inet_sk(sk)->inet_num && !inet_csk(sk)->icsk_bind_hash);
However, the syzkaller's log hinted that connect() failed just before
the warning due to FAULT_INJECTION. [1]
When connect() is called for an unbound socket, we search for an
available ephemeral port. If a bhash bucket exists for the port, we
call __inet_check_established() or __inet6_check_established() to check
if the bucket is reusable.
If reusable, we add the socket into ehash and set inet_sk(sk)->inet_num.
Later, we look up the corresponding bhash2 bucket and try to allocate
it if it does not exist.
Although it rarely occurs in real use, if the allocation fails, we must
revert the changes by check_established(). Otherwise, an unconnected
socket could illegally occupy an ehash entry.
Note that we do not put tw back into ehash because sk might have
already responded to a packet for tw and it would be better to free
tw earlier under such memory presure.
[0]:
WARNING: CPU: 0 PID: 350830 at net/ipv4/inet_connection_sock.c:1193 inet_csk_destroy_sock (net/ipv4/inet_connection_sock.c:1193)
Modules linked in:
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:inet_csk_destroy_sock (net/ipv4/inet_connection_sock.c:1193)
Code: 41 5c 41 5d 41 5e e9 2d 4a 3d fd e8 28 4a 3d fd 48 89 ef e8 f0 cd 7d ff 5b 5d 41 5c 41 5d 41 5e e9 13 4a 3d fd e8 0e 4a 3d fd <0f> 0b e9 61 fe ff ff e8 02 4a 3d fd 4c 89 e7 be 03 00 00 00 e8 05
RSP: 0018:ffffc9000b21fd38 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000009e78 RCX: ffffffff840bae40
RDX: ffff88806e46c600 RSI: ffffffff840bb012 RDI: ffff88811755cca8
RBP: ffff88811755c880 R08: 0000000000000003 R09: 0000000000000000
R10: 0000000000009e78 R11: 0000000000000000 R12: ffff88811755c8e0
R13: ffff88811755c892 R14: ffff88811755c918 R15: 0000000000000000
FS: 00007f03e5243800(0000) GS:ffff88811ae00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b32f21000 CR3: 0000000112ffe001 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
<TASK>
? inet_csk_destroy_sock (net/ipv4/inet_connection_sock.c:1193)
dccp_close (net/dccp/proto.c:1078)
inet_release (net/ipv4/af_inet.c:434)
__sock_release (net/socket.c:660)
sock_close (net/socket.c:1423)
__fput (fs/file_table.c:377)
__fput_sync (fs/file_table.c:462)
__x64_sys_close (fs/open.c:1557 fs/open.c:1539 fs/open.c:1539)
do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
RIP: 0033:0x7f03e53852bb
Code: 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 43 c9 f5 ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 89 44 24 0c e8 a1 c9 f5 ff 8b 44
RSP: 002b:00000000005dfba0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f03e53852bb
RDX: 0000000000000002 RSI: 0000000000000002 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000000167c
R10: 0000000008a79680 R11: 0000000000000293 R12: 00007f03e4e43000
R13: 00007f03e4e43170 R14: 00007f03e4e43178 R15: 00007f03e4e43170
</TASK>
[1]:
FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 0
CPU: 0 PID: 350833 Comm: syz-executor.1 Not tainted 6.7.0-12272-g2121c43f88f5 #9
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1))
should_fail_ex (lib/fault-inject.c:52 lib/fault-inject.c:153)
should_failslab (mm/slub.c:3748)
kmem_cache_alloc (mm/slub.c:3763 mm/slub.c:3842 mm/slub.c:3867)
inet_bind2_bucket_create (net/ipv4/inet_hashtables.c:135)
__inet_hash_connect (net/ipv4/inet_hashtables.c:1100)
dccp_v4_connect (net/dccp/ipv4.c:116)
__inet_stream_connect (net/ipv4/af_inet.c:676)
inet_stream_connect (net/ipv4/af_inet.c:747)
__sys_connect_file (net/socket.c:2048 (discriminator 2))
__sys_connect (net/socket.c:2065)
__x64_sys_connect (net/socket.c:2072)
do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
RIP: 0033:0x7f03e5284e5d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 9f 1b 00 f7 d8 64 89 01 48
RSP: 002b:00007f03e4641cc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 00000000004bbf80 RCX: 00007f03e5284e5d
RDX: 0000000000000010 RSI: 0000000020000000 RDI: 0000000000000003
RBP: 00000000004bbf80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 000000000000000b R14: 00007f03e52e5530 R15: 0000000000000000
</TASK>
Reported-by: syzkaller <syzkaller@googlegroups.com>
Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Tobias Waldekranz says:
====================
net: bridge: switchdev: Ensure MDB events are delivered exactly once
When a device is attached to a bridge, drivers will request a replay
of objects that were created before the device joined the bridge, that
are still of interest to the joining port. Typical examples include
FDB entries and MDB memberships on other ports ("foreign interfaces")
or on the bridge itself.
Conversely when a device is detached, the bridge will synthesize
deletion events for all those objects that are still live, but no
longer applicable to the device in question.
This series eliminates two races related to the synching and
unsynching phases of a bridge's MDB with a joining or leaving device,
that would cause notifications of such objects to be either delivered
twice (1/2), or not at all (2/2).
A similar race to the one solved by 1/2 still remains for the
FDB. This is much harder to solve, due to the lockless operation of
the FDB's rhashtable, and is therefore knowingly left out of this
series.
v1 -> v2:
- Squash the previously separate addition of
switchdev_port_obj_act_is_deferred into first consumer.
- Use ether_addr_equal to compare MAC addresses.
- Document switchdev_port_obj_act_is_deferred (renamed from
switchdev_port_obj_is_deferred in v1, to indicate that we also match
on the action).
- Delay allocations of MDB objects until we know they're needed.
- Use non-RCU version of the hash list iterator, now that the MDB is
not scanned while holding the RCU read lock.
- Add Fixes tag to commit message
v2 -> v3:
- Fix unlocking in error paths
- Access RCU protected port list via mlock_dereference, since MDB is
guaranteed to remain constant for the duration of the scan.
v3 -> v4:
- Limit the search for exiting deferred events in 1/2 to only apply to
additions, since the problem does not exist in the deletion case.
- Add 2/2, to plug a related race when unoffloading an indirectly
associated device.
v4 -> v5:
- Fix grammatical errors in kerneldoc of
switchdev_port_obj_act_is_deferred
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When unoffloading a device, it is important to ensure that all
relevant deferred events are delivered to it before it disassociates
itself from the bridge.
Before this change, this was true for the normal case when a device
maps 1:1 to a net_bridge_port, i.e.
br0
/
swp0
When swp0 leaves br0, the call to switchdev_deferred_process() in
del_nbp() makes sure to process any outstanding events while the
device is still associated with the bridge.
In the case when the association is indirect though, i.e. when the
device is attached to the bridge via an intermediate device, like a
LAG...
br0
/
lag0
/
swp0
...then detaching swp0 from lag0 does not cause any net_bridge_port to
be deleted, so there was no guarantee that all events had been
processed before the device disassociated itself from the bridge.
Fix this by always synchronously processing all deferred events before
signaling completion of unoffloading back to the driver.
Fixes: 4e51bf44a03a ("net: bridge: move the switchdev object replay helpers to "push" mode")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Before this change, generation of the list of MDB events to replay
would race against the creation of new group memberships, either from
the IGMP/MLD snooping logic or from user configuration.
While new memberships are immediately visible to walkers of
br->mdb_list, the notification of their existence to switchdev event
subscribers is deferred until a later point in time. So if a replay
list was generated during a time that overlapped with such a window,
it would also contain a replay of the not-yet-delivered event.
The driver would thus receive two copies of what the bridge internally
considered to be one single event. On destruction of the bridge, only
a single membership deletion event was therefore sent. As a
consequence of this, drivers which reference count memberships (at
least DSA), would be left with orphan groups in their hardware
database when the bridge was destroyed.
This is only an issue when replaying additions. While deletion events
may still be pending on the deferred queue, they will already have
been removed from br->mdb_list, so no duplicates can be generated in
that scenario.
To a user this meant that old group memberships, from a bridge in
which a port was previously attached, could be reanimated (in
hardware) when the port joined a new bridge, without the new bridge's
knowledge.
For example, on an mv88e6xxx system, create a snooping bridge and
immediately add a port to it:
root@infix-06-0b-00:~$ ip link add dev br0 up type bridge mcast_snooping 1 && \
> ip link set dev x3 up master br0
And then destroy the bridge:
root@infix-06-0b-00:~$ ip link del dev br0
root@infix-06-0b-00:~$ mvls atu
ADDRESS FID STATE Q F 0 1 2 3 4 5 6 7 8 9 a
DEV:0 Marvell 88E6393X
33:33:00:00:00:6a 1 static - - 0 . . . . . . . . . .
33:33:ff:87:e4:3f 1 static - - 0 . . . . . . . . . .
ff:ff:ff:ff:ff:ff 1 static - - 0 1 2 3 4 5 6 7 8 9 a
root@infix-06-0b-00:~$
The two IPv6 groups remain in the hardware database because the
port (x3) is notified of the host's membership twice: once via the
original event and once via a replay. Since only a single delete
notification is sent, the count remains at 1 when the bridge is
destroyed.
Then add the same port (or another port belonging to the same hardware
domain) to a new bridge, this time with snooping disabled:
root@infix-06-0b-00:~$ ip link add dev br1 up type bridge mcast_snooping 0 && \
> ip link set dev x3 up master br1
All multicast, including the two IPv6 groups from br0, should now be
flooded, according to the policy of br1. But instead the old
memberships are still active in the hardware database, causing the
switch to only forward traffic to those groups towards the CPU (port
0).
Eliminate the race in two steps:
1. Grab the write-side lock of the MDB while generating the replay
list.
This prevents new memberships from showing up while we are generating
the replay list. But it leaves the scenario in which a deferred event
was already generated, but not delivered, before we grabbed the
lock. Therefore:
2. Make sure that no deferred version of a replay event is already
enqueued to the switchdev deferred queue, before adding it to the
replay list, when replaying additions.
Fixes: 4f2673b3a2b6 ("net: bridge: add helper to replay port and host-joined mdb entries")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
iucv_path_table is a dynamically allocated array of pointers to
struct iucv_path items. Yet, its size is calculated as if it was
an array of struct iucv_path items.
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
https://gitlab.freedesktop.org/drm/msm into drm-fixes
Fixes for v6.8-rc5
GPU:
- dmabuf vmap fix
- a610 UBWC corruption fix (incorrect hbb)
- revert a commit that was making GPU recovery unreliable
- tlb invalidation fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGszDSiw66+a=ttBr-hat+zrcBtfc_cZ4LQqXu89DJ0UeQ@mail.gmail.com
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.8-2024-02-15-2:
amdgpu:
- PSR fixes
- Suspend/resume fixes
- Link training fix
- Aspect ratio fix
- DCN 3.5 fixes
- VCN 4.x fix
- GFX 11 fix
- Misc display fixes
- Misc small fixes
amdkfd:
- Cache size reporting fix
- SIMD distribution fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240215192452.11805-1-alexander.deucher@amd.com
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Fix an out-of-bounds shift.
- Fix the display code thinking xe uses shmem
- Fix a warning about index out-of-bound
- Fix a clang-16 compilation warning
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zc4GpcrbFVqdK9Ws@fedora
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
Fix for #10172: Blank screen on JSL Chromebooks. Stable fix to limit DP SST link rate to <=8.1Gbps.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zc37W27F5OvoeSkG@jlahtine-mobl.ger.corp.intel.com
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
A suspend/resume error fix for ivpu, a couple of scheduler fixes for
nouveau, a patch to support large page arrays in prime, a uninitialized
variable fix in crtc, a locking fix in rockchip/vop2 and a buddy
allocator error reporting fix.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/b4ffqzigtfh6cgzdpwuk6jlrv3dnk4hu6etiizgvibysqgtl2p@42n2gdfdd5eu
|
|
The conversion to netfs in the 6.3 kernel caused a regression when
maximum write size is set by the server to an unexpected value which is
not a multiple of 4096 (similarly if the user overrides the maximum
write size by setting mount parm "wsize", but sets it to a value that
is not a multiple of 4096). When negotiated write size is not a
multiple of 4096 the netfs code can skip the end of the final
page when doing large sequential writes, causing data corruption.
This section of code is being rewritten/removed due to a large
netfs change, but until that point (ie for the 6.3 kernel until now)
we can not support non-standard maximum write sizes.
Add a warning if a user specifies a wsize on mount that is not
a multiple of 4096 (and round down), also add a change where we
round down the maximum write size if the server negotiates a value
that is not a multiple of 4096 (we also have to check to make sure that
we do not round it down to zero).
Reported-by: R. Diez" <rdiez-2006@rd10.de>
Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list")
Suggested-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Acked-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Tested-by: Matthew Ruffell <matthew.ruffell@canonical.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Cc: stable@vger.kernel.org # v6.3+
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
|
|
Hou Tao says:
====================
Fix the read of vsyscall page through bpf
From: Hou Tao <houtao1@huawei.com>
Hi,
As reported by syzboot [1] and [2], when trying to read vsyscall page
by using bpf_probe_read_kernel() or bpf_probe_read(), oops may happen.
Thomas Gleixner had proposed a test patch [3], but it seems that no
formal patch is posted after about one month [4], so I post it instead
and add an Originally-by tag in patch #2.
Patch #1 makes is_vsyscall_vaddr() being a common helper. Patch #2 fixes
the problem by disallowing vsyscall page read for
copy_from_kernel_nofault(). Patch #3 adds one test case to ensure the
read of vsyscall page through bpf is rejected. Please see individual
patches for more details.
Comments are always welcome.
[1]: https://lore.kernel.org/bpf/CAG48ez06TZft=ATH1qh2c5mpS5BT8UakwNkzi6nvK5_djC-4Nw@mail.gmail.com/
[2]: https://lore.kernel.org/bpf/CABOYnLynjBoFZOf3Z4BhaZkc5hx_kHfsjiW+UWLoB=w33LvScw@mail.gmail.com/
[3]: https://lore.kernel.org/bpf/87r0jwquhv.ffs@tglx/
[4]: https://lore.kernel.org/bpf/e24b125c-8ff4-9031-6c53-67ff2e01f316@huaweicloud.com/
Change Log:
v3:
* rephrase commit message for patch #1 & #2 (Sohil)
* reword comments in copy_from_kernel_nofault_allowed() (Sohil)
* add Rvb tag for patch #1 and Acked-by tag for patch #3 (Sohil, Yonghong)
v2: https://lore.kernel.org/bpf/20240126115423.3943360-1-houtao@huaweicloud.com/
* move is_vsyscall_vaddr to asm/vsyscall.h instead (Sohil)
* elaborate on the reason for disallowing of vsyscall page read in
copy_from_kernel_nofault_allowed() (Sohil)
* update the commit message of patch #2 to more clearly explain how
the oops occurs. (Sohil)
* update the commit message of patch #3 to explain the expected return
values of various bpf helpers (Yonghong)
v1: https://lore.kernel.org/bpf/20240119073019.1528573-1-houtao@huaweicloud.com/
====================
Link: https://lore.kernel.org/r/20240202103935.3154011-1-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Under x86-64, when using bpf_probe_read_kernel{_str}() or
bpf_probe_read{_str}() to read vsyscall page, the read may trigger oops,
so add one test case to ensure that the problem is fixed. Beside those
four bpf helpers mentioned above, testing the read of vsyscall page by
using bpf_probe_read_user{_str} and bpf_copy_from_user{_task}() as well.
The test case passes the address of vsyscall page to these six helpers
and checks whether the returned values are expected:
1) For bpf_probe_read_kernel{_str}()/bpf_probe_read{_str}(), the
expected return value is -ERANGE as shown below:
bpf_probe_read_kernel_common
copy_from_kernel_nofault
// false, return -ERANGE
copy_from_kernel_nofault_allowed
2) For bpf_probe_read_user{_str}(), the expected return value is -EFAULT
as show below:
bpf_probe_read_user_common
copy_from_user_nofault
// false, return -EFAULT
__access_ok
3) For bpf_copy_from_user(), the expected return value is -EFAULT:
// return -EFAULT
bpf_copy_from_user
copy_from_user
_copy_from_user
// return false
access_ok
4) For bpf_copy_from_user_task(), the expected return value is -EFAULT:
// return -EFAULT
bpf_copy_from_user_task
access_process_vm
// return 0
vma_lookup()
// return 0
expand_stack()
The occurrence of oops depends on the availability of CPU SMAP [1]
feature and there are three possible configurations of vsyscall page in
the boot cmd-line: vsyscall={xonly|none|emulate}, so there are a total
of six possible combinations. Under all these combinations, the test
case runs successfully.
[1]: https://en.wikipedia.org/wiki/Supervisor_Mode_Access_Prevention
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20240202103935.3154011-4-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
When trying to use copy_from_kernel_nofault() to read vsyscall page
through a bpf program, the following oops was reported:
BUG: unable to handle page fault for address: ffffffffff600000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 3231067 P4D 3231067 PUD 3233067 PMD 3235067 PTE 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 1 PID: 20390 Comm: test_progs ...... 6.7.0+ #58
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) ......
RIP: 0010:copy_from_kernel_nofault+0x6f/0x110
......
Call Trace:
<TASK>
? copy_from_kernel_nofault+0x6f/0x110
bpf_probe_read_kernel+0x1d/0x50
bpf_prog_2061065e56845f08_do_probe_read+0x51/0x8d
trace_call_bpf+0xc5/0x1c0
perf_call_bpf_enter.isra.0+0x69/0xb0
perf_syscall_enter+0x13e/0x200
syscall_trace_enter+0x188/0x1c0
do_syscall_64+0xb5/0xe0
entry_SYSCALL_64_after_hwframe+0x6e/0x76
</TASK>
......
---[ end trace 0000000000000000 ]---
The oops is triggered when:
1) A bpf program uses bpf_probe_read_kernel() to read from the vsyscall
page and invokes copy_from_kernel_nofault() which in turn calls
__get_user_asm().
2) Because the vsyscall page address is not readable from kernel space,
a page fault exception is triggered accordingly.
3) handle_page_fault() considers the vsyscall page address as a user
space address instead of a kernel space address. This results in the
fix-up setup by bpf not being applied and a page_fault_oops() is invoked
due to SMAP.
Considering handle_page_fault() has already considered the vsyscall page
address as a userspace address, fix the problem by disallowing vsyscall
page read for copy_from_kernel_nofault().
Originally-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: syzbot+72aa0161922eba61b50e@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/bpf/CAG48ez06TZft=ATH1qh2c5mpS5BT8UakwNkzi6nvK5_djC-4Nw@mail.gmail.com
Reported-by: xingwei lee <xrivendell7@gmail.com>
Closes: https://lore.kernel.org/bpf/CABOYnLynjBoFZOf3Z4BhaZkc5hx_kHfsjiW+UWLoB=w33LvScw@mail.gmail.com
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240202103935.3154011-3-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Move is_vsyscall_vaddr() into asm/vsyscall.h to make it available for
copy_from_kernel_nofault_allowed() in arch/x86/mm/maccess.c.
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/20240202103935.3154011-2-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v6.9:
UAPI Changes:
Cross-subsystem Changes:
arch:
- powerpc/ps3: select CONFIG_VIDEO
Core Changes:
ci:
- msm: fix apq8016 runner
display:
- use newer DRM print helpers
documentation:
- fix typos
print:
- add device-specific error and debug printers
sysfb:
- set Linux parent device for firmware framebuffer
tests:
- mm: use newer DRM print helpers
Driver Changes:
bridge:
- switch to ->read_edid callback throughout the bridge
drivers
- remove old ->get_edid callback
i915:
- use newer DRM print helpers
lima:
- improve stability by fixes to error handling and recovery
mediathek:
- switch to ->read_edid callback
msm:
- switch to ->read_edid callback
omap:
- switch to ->read_edid callback
panel:
- add Powkiddy RGB10MAX3 plus DT bindings
- st7703: support panel rotation plus DT bindings
rockchip:
- DT bindings: remove port, add power-domains
xe:
- use newer DRM print helpers
xlnx:
- switch to ->read_edid callback
Signed-off-by: Dave Airlie <airlied@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmXOD/oACgkQaA3BHVML
# eiMWMAgArTVXF4UQ+FUxYZB5QTm2veYIpilvwmzaQLNxsM9SsWpzwMIVAi+xf93g
# uqUqkl6QvZ9pJg6bxuXRNcJw/GObIO4x6tn+LkbccczgHiHwvn6ydNdUoMx8ulne
# EsGC0z8bb5Gpwh9b/pnBul2AoIE7PHAJltgH271/O2xnhFMUbchQ0ckHvWnn8/GA
# Nef145ySX4gkYtY8u2TRr4r6Bkp7Tpiyv6ipU7Cpu7KqyveTDMx3c9r5FaiHnJT/
# Hx/5s87q0Bx2m+iNjlBLJzYjF2UWth+pbfiu3xwyWOE7hdkPLwCQ5mqHWcFFqxfb
# Vuj9jP+Vb68L7EvGpq2LArLdhZjHIQ==
# =SsjX
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 15 Feb 2024 23:22:02 AEST
# gpg: using RSA key 7217FBAC8CE9CF6344A168E5680DC11D530B7A23
# gpg: Can't check signature: No public key
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240215132610.GA1464@localhost.localdomain
|
|
Write error handling is racy and can sometime lead to the error recovery
path wrongly changing the inode size of a sequential zone file to an
incorrect value which results in garbage data being readable at the end
of a file. There are 2 problems:
1) zonefs_file_dio_write() updates a zone file write pointer offset
after issuing a direct IO with iomap_dio_rw(). This update is done
only if the IO succeed for synchronous direct writes. However, for
asynchronous direct writes, the update is done without waiting for
the IO completion so that the next asynchronous IO can be
immediately issued. However, if an asynchronous IO completes with a
failure right before the i_truncate_mutex lock protecting the update,
the update may change the value of the inode write pointer offset
that was corrected by the error path (zonefs_io_error() function).
2) zonefs_io_error() is called when a read or write error occurs. This
function executes a report zone operation using the callback function
zonefs_io_error_cb(), which does all the error recovery handling
based on the current zone condition, write pointer position and
according to the mount options being used. However, depending on the
zoned device being used, a report zone callback may be executed in a
context that is different from the context of __zonefs_io_error(). As
a result, zonefs_io_error_cb() may be executed without the inode
truncate mutex lock held, which can lead to invalid error processing.
Fix both problems as follows:
- Problem 1: Perform the inode write pointer offset update before a
direct write is issued with iomap_dio_rw(). This is safe to do as
partial direct writes are not supported (IOMAP_DIO_PARTIAL is not
set) and any failed IO will trigger the execution of zonefs_io_error()
which will correct the inode write pointer offset to reflect the
current state of the one on the device.
- Problem 2: Change zonefs_io_error_cb() into zonefs_handle_io_error()
and call this function directly from __zonefs_io_error() after
obtaining the zone information using blkdev_report_zones() with a
simple callback function that copies to a local stack variable the
struct blk_zone obtained from the device. This ensures that error
handling is performed holding the inode truncate mutex.
This change also simplifies error handling for conventional zone files
by bypassing the execution of report zones entirely. This is safe to
do because the condition of conventional zones cannot be read-only or
offline and conventional zone files are always fully mapped with a
constant file size.
Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:
- Add GuC submission interface version query (Tvrtko Ursulin)
Driver Changes:
Fixes/improvements/new stuff:
- Atomically invalidate userptr on mmu-notifier (Jonathan Cavitt)
- Update handling of MMIO triggered reports (Umesh Nerlige Ramappa)
- Don't make assumptions about intel_wakeref_t type (Jani Nikula)
- Add workaround 14019877138 [xelpg] (Tejas Upadhyay)
- Allow for very slow HuC loading [huc] (John Harrison)
- Flush context destruction worker at suspend [guc] (Alan Previn)
- Close deregister-context race against CT-loss [guc] (Alan Previn)
- Avoid circular locking issue on busyness flush [guc] (John Harrison)
- Use rc6.supported flag from intel_gt for rc6_enable sysfs (Juan Escamilla)
- Reflect the true and current status of rc6_enable (Juan Escamilla)
- Wake GT before sending H2G message [mtl] (Vinay Belgaumkar)
- Restart the heartbeat timer when forcing a pulse (John Harrison)
Future platform enablement:
- Extend driver code of Xe_LPG to Xe_LPG+ [xelpg] (Harish Chegondi)
- Extend some workarounds/tuning to gfx version 12.74 [xelpg] (Matt Roper)
Miscellaneous:
- Reconcile Excess struct member kernel-doc warnings (Randy Dunlap)
- Change wa and EU_PERF_CNTL registers to MCR type [guc] (Shuicheng Lin)
- Add flex arrays to struct i915_syncmap (Erick Archer)
- Increasing the sleep time for live_rc6_manual [selftests] (Anirban Sk)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zc3iIVsiAwo+bu10@tursulin-desk
|
|
md_start_sync() will suspend the array if there are spares that can be
added or removed from conf, however, if reshape is still in progress,
this won't happen at all or data will be corrupted(remove_and_add_spares
won't be called from md_choose_sync_action for reshape), hence there is
no need to suspend the array if reshape is not done yet.
Meanwhile, there is a potential deadlock for raid456:
1) reshape is interrupted;
2) set one of the disk WantReplacement, and add a new disk to the array,
however, recovery won't start until the reshape is finished;
3) then issue an IO across reshpae position, this IO will wait for
reshape to make progress;
4) continue to reshape, then md_start_sync() found there is a spare disk
that can be added to conf, mddev_suspend() is called;
Step 4 and step 3 is waiting for each other, deadlock triggered. Noted
this problem is found by code review, and it's not reporduced yet.
Fix this porblem by don't suspend the array for interrupted reshape,
this is safe because conf won't be changed until reshape is done.
Fixes: bc08041b32ab ("md: suspend array in md_start_sync() if array need reconfiguration")
Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240201092559.910982-6-yukuai1@huaweicloud.com
|
|
Currently, if reshape is interrupted, then reassemble the array will
register sync_thread directly from pers->run(), in this case
'MD_RECOVERY_RUNNING' is set directly, however, there is no guarantee
that md_do_sync() will be executed, hence stop_sync_thread() will hang
because 'MD_RECOVERY_RUNNING' can't be cleared.
Last patch make sure that md_do_sync() will set MD_RECOVERY_DONE,
however, following hang can still be triggered by dm-raid test
shell/lvconvert-raid-reshape.sh occasionally:
[root@fedora ~]# cat /proc/1982/stack
[<0>] stop_sync_thread+0x1ab/0x270 [md_mod]
[<0>] md_frozen_sync_thread+0x5c/0xa0 [md_mod]
[<0>] raid_presuspend+0x1e/0x70 [dm_raid]
[<0>] dm_table_presuspend_targets+0x40/0xb0 [dm_mod]
[<0>] __dm_destroy+0x2a5/0x310 [dm_mod]
[<0>] dm_destroy+0x16/0x30 [dm_mod]
[<0>] dev_remove+0x165/0x290 [dm_mod]
[<0>] ctl_ioctl+0x4bb/0x7b0 [dm_mod]
[<0>] dm_ctl_ioctl+0x11/0x20 [dm_mod]
[<0>] vfs_ioctl+0x21/0x60
[<0>] __x64_sys_ioctl+0xb9/0xe0
[<0>] do_syscall_64+0xc6/0x230
[<0>] entry_SYSCALL_64_after_hwframe+0x6c/0x74
Meanwhile mddev->recovery is:
MD_RECOVERY_RUNNING |
MD_RECOVERY_INTR |
MD_RECOVERY_RESHAPE |
MD_RECOVERY_FROZEN
Fix this problem by remove the code to register sync_thread directly
from raid10 and raid5. And let md_check_recovery() to register
sync_thread.
Fixes: f67055780caa ("[PATCH] md: Checkpoint and allow restart of raid5 reshape")
Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240201092559.910982-5-yukuai1@huaweicloud.com
|
|
stop_sync_thread() will interrupt md_do_sync(), and md_do_sync() must
set MD_RECOVERY_DONE, so that follow up md_check_recovery() will
unregister sync_thread, clear MD_RECOVERY_RUNNING and wake up
stop_sync_thread().
If MD_RECOVERY_WAIT is set or the array is read-only, md_do_sync() will
return without setting MD_RECOVERY_DONE, and after commit f52f5c71f3d4
("md: fix stopping sync thread"), dm-raid switch from
md_reap_sync_thread() to stop_sync_thread() to unregister sync_thread
from md_stop() and md_stop_writes(), causing the test
shell/lvconvert-raid-reshape.sh hang.
We shouldn't switch back to md_reap_sync_thread() because it's
problematic in the first place. Fix the problem by making sure
md_do_sync() will set MD_RECOVERY_DONE.
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Closes: https://lore.kernel.org/all/ece2b06f-d647-6613-a534-ff4c9bec1142@redhat.com/
Fixes: d5d885fd514f ("md: introduce new personality funciton start()")
Fixes: 5fd6c1dce06e ("[PATCH] md: allow checkpoint of recovery with version-1 superblock")
Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240201092559.910982-4-yukuai1@huaweicloud.com
|
|
Usually if the array is not read-write, md_check_recovery() won't
register new sync_thread in the first place. And if the array is
read-write and sync_thread is registered, md_set_readonly() will
unregister sync_thread before setting the array read-only. md/raid
follow this behavior hence there is no problem.
After commit f52f5c71f3d4 ("md: fix stopping sync thread"), following
hang can be triggered by test shell/integrity-caching.sh:
1) array is read-only. dm-raid update super block:
rs_update_sbs
ro = mddev->ro
mddev->ro = 0
-> set array read-write
md_update_sb
2) register new sync thread concurrently.
3) dm-raid set array back to read-only:
rs_update_sbs
mddev->ro = ro
4) stop the array:
raid_dtr
md_stop
stop_sync_thread
set_bit(MD_RECOVERY_INTR, &mddev->recovery);
md_wakeup_thread_directly(mddev->sync_thread);
wait_event(..., !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
5) sync thread done:
md_do_sync
set_bit(MD_RECOVERY_DONE, &mddev->recovery);
md_wakeup_thread(mddev->thread);
6) daemon thread can't unregister sync thread:
md_check_recovery
if (!md_is_rdwr(mddev) &&
!test_bit(MD_RECOVERY_NEEDED, &mddev->recovery))
return;
-> -> MD_RECOVERY_RUNNING can't be cleared, hence step 4 hang;
The root cause is that dm-raid manipulate 'mddev->ro' by itself,
however, dm-raid really should stop sync thread before setting the
array read-only. Unfortunately, I need to read more code before I
can refacter the handler of 'mddev->ro' in dm-raid, hence let's fix
the problem the easy way for now to prevent dm-raid regression.
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Closes: https://lore.kernel.org/all/9801e40-8ac7-e225-6a71-309dcf9dc9aa@redhat.com/
Fixes: ecbfb9f118bc ("dm raid: add raid level takeover support")
Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240201092559.910982-3-yukuai1@huaweicloud.com
|
|
mddev_suspend() never stop sync_thread, hence it doesn't make sense to
ignore suspended array in md_check_recovery(), which might cause
sync_thread can't be unregistered.
After commit f52f5c71f3d4 ("md: fix stopping sync thread"), following
hang can be triggered by test shell/integrity-caching.sh:
1) suspend the array:
raid_postsuspend
mddev_suspend
2) stop the array:
raid_dtr
md_stop
__md_stop_writes
stop_sync_thread
set_bit(MD_RECOVERY_INTR, &mddev->recovery);
md_wakeup_thread_directly(mddev->sync_thread);
wait_event(..., !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
3) sync thread done:
md_do_sync
set_bit(MD_RECOVERY_DONE, &mddev->recovery);
md_wakeup_thread(mddev->thread);
4) daemon thread can't unregister sync thread:
md_check_recovery
if (mddev->suspended)
return; -> return directly
md_read_sync_thread
clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
-> MD_RECOVERY_RUNNING can't be cleared, hence step 2 hang;
This problem is not just related to dm-raid, fix it by ignoring
suspended array in md_check_recovery(). And follow up patches will
improve dm-raid better to frozen sync thread during suspend.
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Closes: https://lore.kernel.org/all/8fb335e-6d2c-dbb5-d7-ded8db5145a@redhat.com/
Fixes: 68866e425be2 ("MD: no sync IO while suspended")
Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20240201092559.910982-2-yukuai1@huaweicloud.com
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-next
drm/i915 feature pull for v6.9:
Features and functionality:
- Early transport for panel replay and PSR (Jouni)
- New ARL PCI IDs (Matt)
- DP TPS4 PHY test pattern support (Khaled)
Refactoring and cleanups:
- Unify and improve VSC SDP for PSR and non-PSR cases (Jouni)
- Refactor memory regions and improve debug logging (Ville)
- Rework global state serialization (Ville)
- Remove unused CDCLK divider fields (Gustavo)
- Unify HDCP connector logging format (Jani)
- Use display instead of graphics version in display code (Jani)
- Move VBT and opregion debugfs next to the implementation (Jani)
- Abstract opregion interface, use opaque type (Jani)
Fixes:
- Fix MTL stolen memory access (Ville)
- Fix initial display plane readout for MTL (Ville)
- Fix HPD handling during driver init/shutdown (Imre)
- Cursor vblank evasion fixes (Ville)
- Various VSC SDP fixes (Jouni)
- Allow PSR mode changes without full modeset (Jouni)
- Fix CDCLK sanitization on module load for Xe2_LPD (Gustavo)
- Fix the max DSC bpc supported by the source (Ankit)
- Add missing LNL ALPM AUX wake configuration (Jouni)
- Cx0 PHY state readout and verify fixes (Mika)
- Fix PSR (panel replay) debugfs for MST connectors (Imre)
- Fail HDCP repeater authentication if Type1 device not present (Suraj)
- Ratelimit debug logging in vm_fault_ttm (Nirmoy)
- Use a fake PCH for MTL because south display is not on the PCH (Haridhar)
- Disable DSB for Xe driver for now (José)
- Fix some LNL display register changes (Lucas)
- Fix build on ChromeOS (Paz Zcharya)
- Preserve current shared DPLL for fastsets on Type-C ports (Ville)
- Fix state checker warnings for MG/TC/TBT PLLs (Ville)
- Fix HDCP repeater ctl register value on errors (Jani)
- Allow FBC with CCS modifiers on SKL+ (Ville)
- Fix HDCP GGTT pinning (Ville)
DRM core changes:
- Add ratelimited drm dbg print (Nirmoy)
- DPCD PSR early transport macro (Jouni)
Merges:
- Backmerge drm-next to bring Xe driver to drm-intel-next (Jani)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87cyt8cxsh.fsf@intel.com
|
|
JAZZ_ESP is a bool kconfig symbol that selects SCSI_SPI_ATTRS. When
CONFIG_SCSI=m, this results in SCSI_SPI_ATTRS=m while JAZZ_ESP=y, which
causes many undefined symbol linker errors.
Fix this by only offering to build this driver when CONFIG_SCSI=y.
[mkp: JAZZ_ESP is unique in that it does not support being compiled as a
module unlike the remaining SPI SCSI HBA drivers]
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20240214055953.9612-1-rdunlap@infradead.org
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: linux-mips@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: James E.J. Bottomley <jejb@linux.ibm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202402112222.Gl0udKyU-lkp@intel.com/
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Correct blk-mq registration issue with module parameter
disable_managed_interrupts enabled.
When we turn off the default PCI_IRQ_AFFINITY flag, the driver needs to
register with blk-mq using blk_mq_map_queues(). The driver is currently
calling blk_mq_pci_map_queues() which results in a stack trace and possibly
undefined behavior.
Stack Trace:
[ 7.860089] scsi host2: smartpqi
[ 7.871934] WARNING: CPU: 0 PID: 238 at block/blk-mq-pci.c:52 blk_mq_pci_map_queues+0xca/0xd0
[ 7.889231] Modules linked in: sd_mod t10_pi sg uas smartpqi(+) crc32c_intel scsi_transport_sas usb_storage dm_mirror dm_region_hash dm_log dm_mod ipmi_devintf ipmi_msghandler fuse
[ 7.924755] CPU: 0 PID: 238 Comm: kworker/0:3 Not tainted 4.18.0-372.88.1.el8_6_smartpqi_test.x86_64 #1
[ 7.944336] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 03/08/2022
[ 7.963026] Workqueue: events work_for_cpu_fn
[ 7.978275] RIP: 0010:blk_mq_pci_map_queues+0xca/0xd0
[ 7.978278] Code: 48 89 de 89 c7 e8 f6 0f 4f 00 3b 05 c4 b7 8e 01 72 e1 5b 31 c0 5d 41 5c 41 5d 41 5e 41 5f e9 7d df 73 00 31 c0 e9 76 df 73 00 <0f> 0b eb bc 90 90 0f 1f 44 00 00 41 57 49 89 ff 41 56 41 55 41 54
[ 7.978280] RSP: 0018:ffffa95fc3707d50 EFLAGS: 00010216
[ 7.978283] RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 0000000000000010
[ 7.978284] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffff9190c32d4310
[ 7.978286] RBP: 0000000000000000 R08: ffffa95fc3707d38 R09: ffff91929b81ac00
[ 7.978287] R10: 0000000000000001 R11: ffffa95fc3707ac0 R12: 0000000000000000
[ 7.978288] R13: ffff9190c32d4000 R14: 00000000ffffffff R15: ffff9190c4c950a8
[ 7.978290] FS: 0000000000000000(0000) GS:ffff9193efc00000(0000) knlGS:0000000000000000
[ 7.978292] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8.172814] CR2: 000055d11166c000 CR3: 00000002dae10002 CR4: 00000000007706f0
[ 8.172816] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8.172817] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8.172818] PKRU: 55555554
[ 8.172819] Call Trace:
[ 8.172823] blk_mq_alloc_tag_set+0x12e/0x310
[ 8.264339] scsi_add_host_with_dma.cold.9+0x30/0x245
[ 8.279302] pqi_ctrl_init+0xacf/0xc8e [smartpqi]
[ 8.294085] ? pqi_pci_probe+0x480/0x4c8 [smartpqi]
[ 8.309015] pqi_pci_probe+0x480/0x4c8 [smartpqi]
[ 8.323286] local_pci_probe+0x42/0x80
[ 8.337855] work_for_cpu_fn+0x16/0x20
[ 8.351193] process_one_work+0x1a7/0x360
[ 8.364462] ? create_worker+0x1a0/0x1a0
[ 8.379252] worker_thread+0x1ce/0x390
[ 8.392623] ? create_worker+0x1a0/0x1a0
[ 8.406295] kthread+0x10a/0x120
[ 8.418428] ? set_kthread_struct+0x50/0x50
[ 8.431532] ret_from_fork+0x1f/0x40
[ 8.444137] ---[ end trace 1bf0173d39354506 ]---
Fixes: cf15c3e734e8 ("scsi: smartpqi: Add module param to disable managed ints")
Tested-by: Yogesh Chandra Pandey <YogeshChandra.Pandey@microchip.com>
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240213162200.1875970-2-don.brace@microchip.com
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
There is one goto where "sched_clk_scaling_suspend_work" is true but
"scale_up" is uninitialized. It leads to a Smatch uninitialized variable
warning:
drivers/ufs/core/ufshcd.c:1589 ufshcd_devfreq_target() error: uninitialized symbol 'scale_up'.
Fixes: 1d969731b87f ("scsi: ufs: core: Only suspend clock scaling if scaling down")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/c787d37f-1107-4512-8991-bccf80e74a35@moroto.mountain
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
As of commit 066ff571011d ("block: turn bio_kmalloc into a simple kmalloc
wrapper"), a bio allocated by bio_kmalloc() must be freed by bio_uninit()
and kfree(). That is not done properly for the error case, hitting WARN and
NULL pointer dereference in bio_free().
Fixes: 066ff571011d ("block: turn bio_kmalloc into a simple kmalloc wrapper")
CC: stable@vger.kernel.org # 6.1+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Link: https://lore.kernel.org/r/20240214144356.101814-1-naohiro.aota@wdc.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
Commit c92a6b5d6335 ("scsi: core: Query VPD size before getting full
page") removed the logic which checks whether a VPD page is present on
the supported pages list before asking for the page itself. That was
done because SPC helpfully states "The Supported VPD Pages VPD page
list may or may not include all the VPD pages that are able to be
returned by the device server". Testing had revealed a few devices
that supported some of the 0xBn pages but didn't actually list them in
page 0.
Julian Sikorski bisected a problem with his drive resetting during
discovery to the commit above. As it turns out, this particular drive
firmware will crash if we attempt to fetch page 0xB9.
Various approaches were attempted to work around this. In the end,
reinstating the logic that consults VPD page 0 before fetching any
other page was the path of least resistance. A firmware update for the
devices which originally compelled us to remove the check has since
been released.
Link: https://lore.kernel.org/r/20240214221411.2888112-1-martin.petersen@oracle.com
Fixes: c92a6b5d6335 ("scsi: core: Query VPD size before getting full page")
Cc: stable@vger.kernel.org
Cc: Bart Van Assche <bvanassche@acm.org>
Reported-by: Julian Sikorski <belegdol@gmail.com>
Tested-by: Julian Sikorski <belegdol@gmail.com>
Reviewed-by: Lee Duncan <lee.duncan@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|