Age | Commit message (Collapse) | Author |
|
Commit b1529a41f777 "ocfs2: should reclaim the inode if
'__ocfs2_mknod_locked' returns an error" tried to reclaim the claimed
inode if __ocfs2_mknod_locked() fails later. But this introduce a race,
the freed bit may be reused immediately by another thread, which will
update dinode, e.g. i_generation. Then iput this inode will lead to BUG:
inode->i_generation != le32_to_cpu(fe->i_generation)
We could make this inode as bad, but we did want to do operations like
wipe in some cases. Since the claimed inode bit can only affect that an
dinode is missing and will return back after fsck, it seems not a big
problem. So just leave it as is by revert the reclaim logic.
Link: https://lkml.kernel.org/r/20221017130227.234480-1-joseph.qi@linux.alibaba.com
Fixes: b1529a41f777 ("ocfs2: should reclaim the inode if '__ocfs2_mknod_locked' returns an error")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reported-by: Yan Wang <wangyan122@huawei.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Starting with GCC 12.1, the created .gcda format can't be read by gcov
tool. There are 2 significant changes to the .gcda file format that
need to be supported:
a) [gcov: Use system IO buffering]
(23eb66d1d46a34cb28c4acbdf8a1deb80a7c5a05) changed that all sizes in
the format are in bytes and not in words (4B)
b) [gcov: make profile merging smarter]
(72e0c742bd01f8e7e6dcca64042b9ad7e75979de) add a new checksum to the
file header.
Tested with GCC 7.5, 10.4, 12.2 and the current master.
Link: https://lkml.kernel.org/r/624bda92-f307-30e9-9aaa-8cc678b2dfb2@suse.cz
Signed-off-by: Martin Liska <mliska@suse.cz>
Tested-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Inside the zs_destroy_pool() function, there can still be NULL size_class
pointers: if when the next size_class is allocated, inside
zs_create_pool() function, kzalloc will return NULL and handling the error
condition, zs_create_pool() will call zs_destroy_pool().
Link: https://lkml.kernel.org/r/20221013112825.61869-1-avromanov@sberdevices.ru
Fixes: f24263a5a076 ("zsmalloc: remove unnecessary size_class NULL check")
Signed-off-by: Alexey Romanov <avromanov@sberdevices.ru>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Fuzzing produced an invalid argument to vma_merge() which was caught by
the newly added verification of the number of VMAs being removed on
process exit. Analyzing the failure eventually resulted in finding an
issue with the search of a VMA that started at address 0, which caused an
underflow and thus the loss of many VMAs being tracked in the tree. Fix
the underflow by changing the search of the maple tree to use the start
address directly.
Link: https://lkml.kernel.org/r/20221015021135.2816178-1-Liam.Howlett@oracle.com
Fixes: 66850be55e8e ("mm/mempolicy: use vma iterator & maple state instead of vma linked list")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Link: https://lore.kernel.org/r/202210052318.5ad10912-oliver.sang@intel.com
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Update my email address for old entry and add a new entry for my
contribution while working with arm to continue support that work.
Link: https://lkml.kernel.org/r/20221014141016.539625-1-qyousef@layalina.io
Signed-off-by: Qais Yousef <qyousef@layalina.io>
Acked-by: Qais Yousef <qais.yousef@arm.com>
Acked-by: Qais Yousef <qsyousef@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
My time at Oracle is ending at the end of the month. Update my email
address accordingly.
Link: https://lkml.kernel.org/r/Y0a+6+5SHMdvUnpg@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
skb should be free in virtual_nci_send(), otherwise kmemleak will report
memleak.
Steps for reproduction (simulated in qemu):
cd tools/testing/selftests/nci
make
./nci_dev
BUG: memory leak
unreferenced object 0xffff888107588000 (size 208):
comm "nci_dev", pid 206, jiffies 4294945376 (age 368.248s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<000000008d94c8fd>] __alloc_skb+0x1da/0x290
[<00000000278bc7f8>] nci_send_cmd+0xa3/0x350
[<0000000081256a22>] nci_reset_req+0x6b/0xa0
[<000000009e721112>] __nci_request+0x90/0x250
[<000000005d556e59>] nci_dev_up+0x217/0x5b0
[<00000000e618ce62>] nfc_dev_up+0x114/0x220
[<00000000981e226b>] nfc_genl_dev_up+0x94/0xe0
[<000000009bb03517>] genl_family_rcv_msg_doit.isra.14+0x228/0x2d0
[<00000000b7f8c101>] genl_rcv_msg+0x35c/0x640
[<00000000c94075ff>] netlink_rcv_skb+0x11e/0x350
[<00000000440cfb1e>] genl_rcv+0x24/0x40
[<0000000062593b40>] netlink_unicast+0x43f/0x640
[<000000001d0b13cc>] netlink_sendmsg+0x73a/0xbf0
[<000000003272487f>] __sys_sendto+0x324/0x370
[<00000000ef9f1747>] __x64_sys_sendto+0xdd/0x1b0
[<000000001e437841>] do_syscall_64+0x3f/0x90
Fixes: e624e6c3e777 ("nfc: Add a virtual nci device driver")
Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20221020030505.15572-1-shangxiaojing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The `macb_resume`/`macb_suspend` methods already call the
`phylink_start`/`phylink_stop` methods during their execution so
explicitly say that the PM of the PHY is done by MAC by using the
`mac_managed_pm` flag of the `struct phylink_config`.
This also fixes the warning message issued during resume:
WARNING: CPU: 0 PID: 237 at drivers/net/phy/phy_device.c:323 mdio_bus_phy_resume+0x144/0x148
Depends-on: 96de900ae78e ("net: phylink: add mac_managed_pm in phylink_config structure")
Fixes: 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
Signed-off-by: Sergiu Moga <sergiu.moga@microchip.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20221019120929.63098-1-sergiu.moga@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Zhengchao Shao says:
====================
fix some issues in Huawei hinic driver
Fix some issues in Huawei hinic driver. This patchset is compiled only,
not tested.
====================
Link: https://lore.kernel.org/r/20221019095754.189119-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In hinic_vf_func_init(), if VF fails to register information with PF
through the MBOX, the MBOX callback function of VF is released once. But
it is released again in hinic_init_hwdev(). Remove one.
Fixes: 7dd29ee12865 ("hinic: add sriov feature support")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When hinic_set_cmdq_depth() fails in hinic_init_cmdqs(), the cmdq memory is
not released correctly. Fix it.
Fixes: 72ef908bb3ff ("hinic: add three net_device_ops of vf")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When the input parameter idx meets the expected case option in
hinic_dbg_get_func_table(), read_data is not released. Fix it.
Fixes: 5215e16244ee ("hinic: add support to query function table")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The value of lli_credit_cnt is incorrectly assigned, fix it.
Fixes: a0337c0dee68 ("hinic: add support to set and get irq coalesce")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Benjamin Poirier says:
====================
selftests: net: Fix problems in some drivers/net tests
Fix two problems mostly introduced in commit bbb774d921e2 ("net: Add tests
for bonding and team address list management").
====================
Link: https://lore.kernel.org/r/20221019091042.783786-1-bpoirier@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
lag_lib.sh creates the interfaces dummy1 and dummy2 whereas
dev_addr_lists.sh:destroy() deletes the interfaces dummy0 and dummy1. Fix
the mismatch in names.
Fixes: bbb774d921e2 ("net: Add tests for bonding and team address list management")
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When exporting and running a subset of selftests via kselftest, files from
parts of the source tree which were not exported are not available. A few
tests are trying to source such files. Address the problem by using
symlinks.
The problem can be reproduced by running:
make -C tools/testing/selftests gen_tar TARGETS="drivers/net/bonding"
[... extract archive ...]
./run_kselftest.sh
or:
make kselftest KBUILD_OUTPUT=/tmp/kselftests TARGETS="drivers/net/bonding"
Fixes: bbb774d921e2 ("net: Add tests for bonding and team address list management")
Fixes: eccd0a80dc7f ("selftests: net: dsa: add a stress test for unlocked FDB operations")
Link: https://lore.kernel.org/netdev/40f04ded-0c86-8669-24b1-9a313ca21076@redhat.com/
Reported-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently the rx drop is calculated as the sum of multiple HW drop
counters. The issue is that not all the HW drop counters were added for
the rx drop counter. So if for example you have a police that drops
frames, they were not see in the rx drop counter.
Fix this by updating how the rx drop counter is calculated. It is
required to add also RX_RED_PRIO_* HW counters.
Fixes: 12c2d0a5b8e2 ("net: lan966x: add ethtool configuration and statistics")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://lore.kernel.org/r/20221019083056.2744282-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If phy_device_register() fails, phy_device_free() need be called to
put refcount, so memory of phy device and device name can be freed
in callback function.
If get_phy_device() fails, mdiobus_unregister() need be called,
or it will cause warning in mdiobus_free() and kobject is leaked.
Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221019064104.3228892-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot found a crash in tipc_topsrv_accept:
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
Workqueue: tipc_rcv tipc_topsrv_accept
RIP: 0010:kernel_accept+0x22d/0x350 net/socket.c:3487
Call Trace:
<TASK>
tipc_topsrv_accept+0x197/0x280 net/tipc/topsrv.c:460
process_one_work+0x991/0x1610 kernel/workqueue.c:2289
worker_thread+0x665/0x1080 kernel/workqueue.c:2436
kthread+0x2e4/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
It was caused by srv->listener that might be set to null by
tipc_topsrv_stop() in net .exit whereas it's still used in
tipc_topsrv_accept() worker.
srv->listener is protected by srv->idr_lock in tipc_topsrv_stop(), so add
a check for srv->listener under srv->idr_lock in tipc_topsrv_accept() to
avoid the null-ptr-deref. To ensure the lsock is not released during the
tipc_topsrv_accept(), move sock_release() after tipc_topsrv_work_stop()
where it's waiting until the tipc_topsrv_accept worker to be done.
Note that sk_callback_lock is used to protect sk->sk_user_data instead of
srv->listener, and it should check srv in tipc_topsrv_listener_data_ready()
instead. This also ensures that no more tipc_topsrv_accept worker will be
started after tipc_conn_close() is called in tipc_topsrv_stop() where it
sets sk->sk_user_data to null.
Fixes: 0ef897be12b8 ("tipc: separate topology server listener socket from subcsriber sockets")
Reported-by: syzbot+c5ce866a8d30f4be0651@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Link: https://lore.kernel.org/r/4eee264380c409c61c6451af1059b7fb271a7e7b.1666120790.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The patchable_function_entry(5) might output 5 single nop
instructions (depends on toolchain), which will clash with
bpf_arch_text_poke check for 5 bytes nop instruction.
Adding early init call for dispatcher that checks and change
the patchable entry into expected 5 nop instruction if needed.
There's no need to take text_mutex, because we are using it
in early init call which is called at pre-smp time.
Fixes: ceea991a019c ("bpf: Move bpf_dispatcher function out of ftrace locations")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221018075934.574415-1-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Pull drm fixes from Dave Airlie:
"Usual fixes for the week.
The amdgpu contains fixes for two regressions, one reported in
response to rc1 which broke on SI GPUs, and one gfx9 APU regression.
Otherwise it's mostly fixes for new IP, and some GPU reset fixes. vc4
is just HDMI fixes, and panfrost has some mnor types fixes.
Core:
- fix connector DDC pointer
- fix buffer overflow in format_helper_test
amdgpu:
- Mode2 reset fixes for Sienna Cichlid
- Revert broken fan speed sensor fix
- SMU 13.x fixes
- GC 11.x fixes
- RAS fixes
- SR-IOV fixes
- Fix BO move breakage on SI
- Misc compiler fixes
- Fix gfx9 APU regression caused by PCI AER fix
vc4:
- HDMI fixes
panfrost:
- compiler fixes"
* tag 'drm-fixes-2022-10-21' of git://anongit.freedesktop.org/drm/drm: (35 commits)
drm/amdgpu: fix sdma doorbell init ordering on APUs
drm/panfrost: replace endian-specific types with native ones
drm/panfrost: Remove type name from internal structs
drm/connector: Set DDC pointer in drmm_connector_init
drm: tests: Fix a buffer overflow in format_helper_test
drm/amdgpu: use DRM_SCHED_FENCE_DONT_PIPELINE for VM updates
drm/sched: add DRM_SCHED_FENCE_DONT_PIPELINE flag
drm/amdgpu: Fix for BO move issue
drm/amdgpu: dequeue mes scheduler during fini
drm/amd/pm: enable thermal alert on smu_v13_0_10
drm/amdgpu: Program GC registers through RLCG interface in gfx_v11/gmc_v11
drm/amdkfd: Fix type of reset_type parameter in hqd_destroy() callback
drm/amd/display: Increase frame size limit for display_mode_vba_util_32.o
drm/amd/pm: add SMU IP v13.0.4 IF version define to V7
drm/amd/pm: update SMU IP v13.0.4 driver interface version
drm/amd/pm: Init pm_attr_list when dpm is disabled
drm/amd/pm: disable cstate feature for gpu reset scenario
drm/amd/pm: fulfill SMU13.0.7 cstate control interface
drm/amd/pm: fulfill SMU13.0.0 cstate control interface
drm/amdgpu: Add sriov vf ras support in amdgpu_ras_asic_supported
...
|
|
Include the patches that weren't included in the 6.1 pull request.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter.
Current release - regressions:
- revert "net: fix cpu_max_bits_warn() usage in
netif_attrmask_next{,_and}"
- revert "net: sched: fq_codel: remove redundant resource cleanup in
fq_codel_init()"
- dsa: uninitialized variable in dsa_slave_netdevice_event()
- eth: sunhme: uninitialized variable in happy_meal_init()
Current release - new code bugs:
- eth: octeontx2: fix resource not freed after malloc
Previous releases - regressions:
- sched: fix return value of qdisc ingress handling on success
- sched: fix race condition in qdisc_graft()
- udp: update reuse->has_conns under reuseport_lock.
- tls: strp: make sure the TCP skbs do not have overlapping data
- hsr: avoid possible NULL deref in skb_clone()
- tipc: fix an information leak in tipc_topsrv_kern_subscr
- phylink: add mac_managed_pm in phylink_config structure
- eth: i40e: fix DMA mappings leak
- eth: hyperv: fix a RX-path warning
- eth: mtk: fix memory leaks
Previous releases - always broken:
- sched: cake: fix null pointer access issue when cake_init() fails"
* tag 'net-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (43 commits)
net: phy: dp83822: disable MDI crossover status change interrupt
net: sched: fix race condition in qdisc_graft()
net: hns: fix possible memory leak in hnae_ae_register()
wwan_hwsim: fix possible memory leak in wwan_hwsim_dev_new()
sfc: include vport_id in filter spec hash and equal()
genetlink: fix kdoc warnings
selftests: add selftest for chaining of tc ingress handling to egress
net: Fix return value of qdisc ingress handling on success
net: sched: sfb: fix null pointer access issue when sfb_init() fails
Revert "net: sched: fq_codel: remove redundant resource cleanup in fq_codel_init()"
net: sched: cake: fix null pointer access issue when cake_init() fails
ethernet: marvell: octeontx2 Fix resource not freed after malloc
netfilter: nf_tables: relax NFTA_SET_ELEM_KEY_END set flags requirements
netfilter: rpfilter/fib: Set ->flowic_uid correctly for user namespaces.
ionic: catch NULL pointer issue on reconfig
net: hsr: avoid possible NULL deref in skb_clone()
bnxt_en: fix memory leak in bnxt_nvm_test()
ip6mr: fix UAF issue in ip6mr_sk_done() when addrconf_init_net() failed
udp: Update reuse->has_conns under reuseport_lock.
net: ethernet: mediatek: ppe: Remove the unused function mtk_foe_entry_usable()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ata fixes from Damien Le Moal:
"Several minor fixes:
- Fix the module alias for the ahci_imx driver to get autoloading to
work (Alexander)
- Fix a potential array-index-out-of-bounds problem with the
enclosure managment support in the ahci driver (Kai-Heng)
- Several patches to fix compilation warnings thrown by clang in the
ahci_st, sata_rcar, ahci_brcm, ahci_xgene, ahci_imx and ahci_qoriq
drivers (me)"
* tag 'ata-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
ata: ahci_qoriq: Fix compilation warning
ata: ahci_imx: Fix compilation warning
ata: ahci_xgene: Fix compilation warning
ata: ahci_brcm: Fix compilation warning
ata: sata_rcar: Fix compilation warning
ata: ahci_st: Fix compilation warning
ata: ahci: Match EM_MAX_SLOTS with SATA_PMP_MAX_PORTS
ata: ahci-imx: Fix MODULE_ALIAS
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:
- Fix dm-bufio to use test_bit_acquire to properly test_bit on arches
with weaker memory ordering.
- DM core replace DMWARN with DMERR or DMCRIT for fatal errors.
- Enable WQ_HIGHPRI on DM verity target's verify_wq.
- Add documentation for DM verity's try_verify_in_tasklet option.
- Various typo and redundant word fixes in code and/or comments.
* tag 'for-6.1/dm-changes-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm clone: Fix typo in block_device format specifier
dm: remove unnecessary assignment statement in alloc_dev()
dm verity: Add documentation for try_verify_in_tasklet option
dm cache: delete the redundant word 'each' in comment
dm raid: fix typo in analyse_superblocks code comment
dm verity: enable WQ_HIGHPRI on verify_wq
dm raid: delete the redundant word 'that' in comment
dm: change from DMWARN to DMERR or DMCRIT for fatal errors
dm bufio: use the acquire memory barrier when testing for B_READING
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v6.1-rc2:
- Fix a buffer overflow in format_helper_test.
- Set DDC pointer in drmm_connector_init.
- Compiler fixes for panfrost.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/c4d05683-8ebe-93b8-d24c-d1d2c68f12c4@linux.intel.com
|
|
Making polled RCU grace periods account for expedited grace periods
required acquiring the leaf rcu_node structure's lock during early boot,
but after rcu_init() was called. This lock is irq-disabled, but the
code incorrectly assumes that irqs are always disabled when invoking
synchronize_rcu(). The exception is early boot before the scheduler has
started, which means that upon return from synchronize_rcu(), irqs will
be incorrectly enabled.
This commit fixes this bug by using irqsave/irqrestore locking primitives.
Fixes: bf95b2bc3e42 ("rcu: Switch polled grace-period APIs to ->gp_seq_polled")
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.1-2022-10-20:
amdgpu:
- Fix gfx9 APU regression caused by PCI AER fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221020135225.562807-1-alexander.deucher@amd.com
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.1-2022-10-19:
amdgpu:
- Mode2 reset fixes for Sienna Cichlid
- Revert broken fan speed sensor fix
- SMU 13.x fixes
- GC 11.x fixes
- RAS fixes
- SR-IOV fixes
- Fix BO move breakage on SI
- Misc compiler fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221019191357.6208-1-alexander.deucher@amd.com
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Short summary of fixes pull:
* vc4: HDMI fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/Y0gGdlujszCstDeP@linux-uq9g
|
|
cti_enable_hw() and cti_disable_hw() are called from an atomic context
so shouldn't use runtime PM because it can result in a sleep when
communicating with firmware.
Since commit 3c6656337852 ("Revert "firmware: arm_scmi: Add clock
management to the SCMI power domain""), this causes a hang on Juno when
running the Perf Coresight tests or running this command:
perf record -e cs_etm//u -- ls
This was also missed until the revert commit because pm_runtime_put()
was called with the wrong device until commit 692c9a499b28 ("coresight:
cti: Correct the parameter for pm_runtime_put")
With lock and scheduler debugging enabled the following is output:
coresight cti_sys0: cti_enable_hw -- dev:cti_sys0 parent: 20020000.cti
BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1151
in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 330, name: perf-exec
preempt_count: 2, expected: 0
RCU nest depth: 0, expected: 0
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last enabled at (0): [<0000000000000000>] 0x0
hardirqs last disabled at (0): [<ffff80000822b394>] copy_process+0xa0c/0x1948
softirqs last enabled at (0): [<ffff80000822b394>] copy_process+0xa0c/0x1948
softirqs last disabled at (0): [<0000000000000000>] 0x0
CPU: 3 PID: 330 Comm: perf-exec Not tainted 6.0.0-00053-g042116d99298 #7
Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Sep 13 2022
Call trace:
dump_backtrace+0x134/0x140
show_stack+0x20/0x58
dump_stack_lvl+0x8c/0xb8
dump_stack+0x18/0x34
__might_resched+0x180/0x228
__might_sleep+0x50/0x88
__pm_runtime_resume+0xac/0xb0
cti_enable+0x44/0x120
coresight_control_assoc_ectdev+0xc0/0x150
coresight_enable_path+0xb4/0x288
etm_event_start+0x138/0x170
etm_event_add+0x48/0x70
event_sched_in.isra.122+0xb4/0x280
merge_sched_in+0x1fc/0x3d0
visit_groups_merge.constprop.137+0x16c/0x4b0
ctx_sched_in+0x114/0x1f0
perf_event_sched_in+0x60/0x90
ctx_resched+0x68/0xb0
perf_event_exec+0x138/0x508
begin_new_exec+0x52c/0xd40
load_elf_binary+0x6b8/0x17d0
bprm_execve+0x360/0x7f8
do_execveat_common.isra.47+0x218/0x238
__arm64_sys_execve+0x48/0x60
invoke_syscall+0x4c/0x110
el0_svc_common.constprop.4+0xfc/0x120
do_el0_svc+0x34/0xc0
el0_svc+0x40/0x98
el0t_64_sync_handler+0x98/0xc0
el0t_64_sync+0x170/0x174
Fix the issue by removing the runtime PM calls completely. They are not
needed here because it must have already been done when building the
path for a trace.
Fixes: 835d722ba10a ("coresight: cti: Initial CoreSight CTI Driver")
Reported-by: Aishwarya TCV <Aishwarya.TCV@arm.com>
Reported-by: Cristian Marussi <Cristian.Marussi@arm.com>
Suggested-by: Suzuki Poulose <Suzuki.Poulose@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Tested-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20221005131452.1506328-1-james.clark@arm.com
|
|
With lockdeps enabled, we get the following warning:
======================================================
WARNING: possible circular locking dependency detected
------------------------------------------------------
kworker/u12:1/53 is trying to acquire lock:
ffff80000adce220 (coresight_mutex){+.+.}-{4:4}, at: coresight_set_assoc_ectdev_mutex+0x3c/0x5c
but task is already holding lock:
ffff80000add1f60 (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (ect_mutex){+.+.}-{4:4}:
__mutex_lock_common+0xd8/0xe60
mutex_lock_nested+0x44/0x50
cti_add_assoc_to_csdev+0x4c/0x184
coresight_register+0x2f0/0x314
tmc_probe+0x33c/0x414
-> #0 (coresight_mutex){+.+.}-{4:4}:
__lock_acquire+0x1a20/0x32d0
lock_acquire+0x160/0x308
__mutex_lock_common+0xd8/0xe60
mutex_lock_nested+0x44/0x50
coresight_set_assoc_ectdev_mutex+0x3c/0x5c
cti_update_conn_xrefs+0x6c/0xf8
cti_probe+0x33c/0x394
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(ect_mutex);
lock(coresight_mutex);
lock(ect_mutex);
lock(coresight_mutex);
*** DEADLOCK ***
4 locks held by kworker/u12:1/53:
#0: ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1fc/0x63c
#1: (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x228/0x63c
#2: (&dev->mutex){....}-{4:4}, at: __device_attach+0x48/0x1a8
#3: (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394
To fix the same, call cti_add_assoc_to_csdev without the holding
coresight_mutex and confine the locking while setting the associated
ect / cti device using coresight_set_assoc_ectdev_mutex().
Fixes: 177af8285b59 ("coresight: cti: Enable CTI associated with devices")
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20220721130329.3787211-1-sudeep.holla@arm.com
|
|
Reported-by: k2ci <kernel-bot@kylinos.cn>
Signed-off-by: Jiangshan Yi <yijiangshan@kylinos.cn>
Reviewed-by: Jean Delvare <jdelvare@suse.de> # for sis630
Signed-off-by: Wolfram Sang <wsa@kernel.org>
|
|
Monitor's automata definition is only used locally, so make dot2c generate
a static definition.
Link: https://lore.kernel.org/all/202208210332.gtHXje45-lkp@intel.com
Link: https://lore.kernel.org/all/202208210358.6HH3OrVs-lkp@intel.com
Link: https://lkml.kernel.org/r/ffbb92010f643307766c9307fd42f416e5b85fa0.1661266564.git.bristot@kernel.org
Cc: Steven Rostedt <rostedt@goodmis.org>
Fixes: e3c9fc78f096 ("tools/rv: Add dot2c")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
clear_cpu_cap(&boot_cpu_data) is very similar to setup_clear_cpu_cap()
except that the latter also sets a bit in 'cpu_caps_cleared' which
later clears the same cap in secondary cpus, which is likely what is
meant here.
Fixes: 47125db27e47 ("perf/x86/intel/lbr: Support Architectural LBR")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lkml.kernel.org/r/20220718141123.136106-2-mlevitsk@redhat.com
|
|
Different function signatures means they needs to be different
functions; otherwise CFI gets upset.
As triggered by the ftrace boot tests:
[] CFI failure at ftrace_return_to_handler+0xac/0x16c (target: ftrace_stub+0x0/0x14; expected type: 0x0a5d5347)
Fixes: 3c516f89e17e ("x86: Add support for CONFIG_CFI_CLANG")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lkml.kernel.org/r/Y06dg4e1xF6JTdQq@hirez.programming.kicks-ass.net
|
|
Remove the weird jumps to RET and simply use RET.
This then promotes ftrace_stub() to a real function; which becomes
important for kcfi.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111148.719080593@infradead.org
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
|
|
Commit 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
uncovered a bug in amdgpu that required a reordering of the driver
init sequence to avoid accessing a special register on the GPU
before it was properly set up leading to an PCI AER error. This
reordering uncovered a different hw programming ordering dependency
in some APUs where the SDMA doorbells need to be programmed before
the GFX doorbells. To fix this, move the SDMA doorbell programming
back into the soc15 common code, but use the actual doorbell range
values directly rather than the values stored in the ring structure
since those will not be initialized at this point.
This is a partial revert, but with the doorbell assignment
fixed so the proper doorbell index is set before it's used.
Fixes: e3163bc8ffdfdb ("drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega")
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: skhan@linuxfoundation.org
Cc: stable@vger.kernel.org
|
|
As previous commit, 'blk_trace_cleanup' will stop block trace if
block trace's state is 'Blktrace_running'.
So remove unnessary stop block trace in 'blk_trace_shutdown'.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221019033602.752383-4-yebin@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When test as follows:
step1: ioctl(sda, BLKTRACESETUP, &arg)
step2: ioctl(sda, BLKTRACESTART, NULL)
step3: ioctl(sda, BLKTRACETEARDOWN, NULL)
step4: ioctl(sda, BLKTRACESETUP, &arg)
Got issue as follows:
debugfs: File 'dropped' in directory 'sda' already present!
debugfs: File 'msg' in directory 'sda' already present!
debugfs: File 'trace0' in directory 'sda' already present!
And also find syzkaller report issue like "KASAN: use-after-free Read in relay_switch_subbuf"
"https://syzkaller.appspot.com/bug?id=13849f0d9b1b818b087341691be6cc3ac6a6bfb7"
If remove block trace without stop(BLKTRACESTOP) block trace, '__blk_trace_remove'
will just set 'q->blk_trace' with NULL. However, debugfs file isn't removed, so
will report file already present when call BLKTRACESETUP.
static int __blk_trace_remove(struct request_queue *q)
{
struct blk_trace *bt;
bt = rcu_replace_pointer(q->blk_trace, NULL,
lockdep_is_held(&q->debugfs_mutex));
if (!bt)
return -EINVAL;
if (bt->trace_state != Blktrace_running)
blk_trace_cleanup(q, bt);
return 0;
}
If do test as follows:
step1: ioctl(sda, BLKTRACESETUP, &arg)
step2: ioctl(sda, BLKTRACESTART, NULL)
step3: ioctl(sda, BLKTRACETEARDOWN, NULL)
step4: remove sda
There will remove debugfs directory which will remove recursively all file
under directory.
>> blk_release_queue
>> debugfs_remove_recursive(q->debugfs_dir)
So all files which created in 'do_blk_trace_setup' are removed, and
'dentry->d_inode' is NULL. But 'q->blk_trace' is still in 'running_trace_lock',
'trace_note_tsk' will traverse 'running_trace_lock' all nodes.
>>trace_note_tsk
>> trace_note
>> relay_reserve
>> relay_switch_subbuf
>> d_inode(buf->dentry)->i_size
To solve above issues, reference commit '5afedf670caf', call 'blk_trace_cleanup'
unconditionally in '__blk_trace_remove' and first stop block trace in
'blk_trace_cleanup'.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221019033602.752383-3-yebin@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Introduce 'blk_trace_{start,stop}' helper. No functional changed.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221019033602.752383-2-yebin@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
bio_put() with REQ_ALLOC_CACHE assumes that it's executed not from
an irq context. Let's add a warning if the invariant is not respected,
especially since there is a couple of places removing REQ_POLLED by hand
without also clearing REQ_ALLOC_CACHE.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/558d78313476c4e9c233902efa0092644c3d420a.1666122465.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If the CPU mask allocation for a node fails, then the memory allocated for
the 'io_wqe' struct of the current node doesn't get freed on the error
handling path, since it has not yet been added to the 'wqes' array.
This was spotted when fuzzing v6.1-rc1 with Syzkaller:
BUG: memory leak
unreferenced object 0xffff8880093d5000 (size 1024):
comm "syz-executor.2", pid 7701, jiffies 4295048595 (age 13.900s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000cb463369>] __kmem_cache_alloc_node+0x18e/0x720
[<00000000147a3f9c>] kmalloc_node_trace+0x2a/0x130
[<000000004e107011>] io_wq_create+0x7b9/0xdc0
[<00000000c38b2018>] io_uring_alloc_task_context+0x31e/0x59d
[<00000000867399da>] __io_uring_add_tctx_node.cold+0x19/0x1ba
[<000000007e0e7a79>] io_uring_setup.cold+0x1b80/0x1dce
[<00000000b545e9f6>] __x64_sys_io_uring_setup+0x5d/0x80
[<000000008a8a7508>] do_syscall_64+0x5d/0x90
[<000000004ac08bec>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: 0e03496d1967 ("io-wq: use private CPU mask")
Cc: stable@vger.kernel.org
Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Link: https://lore.kernel.org/r/20221020014710.902201-1-rafaelmendsr@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
it defined in d0edc2473be9d, but there's nowhere to use it,
so remove it.
Signed-off-by: Yuwei Guan <Yuwei.Guan@zeekrlife.com>
Acked-by: Paolo Valente <paolo.valente@linaro.org>
Link: https://lore.kernel.org/r/20221018030139.159-1-Yuwei.Guan@zeekrlife.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Commit c347a787e34cb (drbd: set ->bi_bdev in drbd_req_new) moved a
bio_set_dev call (which has since been removed) to "earlier", from
drbd_request_prepare to drbd_req_new.
The problem is that this accesses device->ldev->backing_bdev, which is
not NULL-checked at this point. When we don't have an ldev (i.e. when
the DRBD device is diskless), this leads to a null pointer deref.
So, only allocate the private_bio if we actually have a disk. This is
also a small optimization, since we don't clone the bio to only to
immediately free it again in the diskless case.
Fixes: c347a787e34cb ("drbd: set ->bi_bdev in drbd_req_new")
Co-developed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Co-developed-by: Joel Colledge <joel.colledge@linbit.com>
Signed-off-by: Joel Colledge <joel.colledge@linbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221020085205.129090-1-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.1
- fix nvme-hwmon for DMA non-cohehrent architectures (Serge Semin)
- add a nvme-hwmong maintainer (Christoph Hellwig)
- fix error pointer dereference in error handling (Dan Carpenter)
- fix invalid memory reference in nvmet_subsys_attr_qid_max_show
(Daniel Wagner)
- don't limit the DMA segment size in nvme-apple (Russell King)
- fix workqueue MEM_RECLAIM flushing dependency (Sagi Grimberg)
- disable write zeroes on various Kingston SSDs (Xander Li)"
* tag 'nvme-6.1-2022-10-22' of git://git.infradead.org/nvme:
nvmet: fix invalid memory reference in nvmet_subsys_attr_qid_max_show
nvmet: fix workqueue MEM_RECLAIM flushing dependency
nvme-hwmon: kmalloc the NVME SMART log buffer
nvme-hwmon: consistently ignore errors from nvme_hwmon_init
nvme: add Guenther as nvme-hwmon maintainer
nvme-apple: don't limit DMA segement size
nvme-pci: disable write zeroes on various Kingston SSD
nvme: fix error pointer dereference in error handling
|
|
If device_register() fails in snd_ac97_dev_register(), it should
call put_device() to give up reference, or the name allocated in
dev_set_name() is leaked.
Fixes: 0ca06a00e206 ("[ALSA] AC97 bus interface for ad-hoc drivers")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221019093025.1179475-1-yangyingliang@huawei.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Update the driver to use an immutable IRQ chip to fix this warning:
"not an immutable chip, please consider fixing it!"
Preserve per-chip labels by adding an ->irq_print_chip() callback.
Tested-by: Svyatoslav Ryhel <clamor95@gmail.com> # TF201 T30
Tested-by: Robert Eckelmann <longnoserob@gmail.com> # TF101 T20
Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
|
|
__le32 and __le64 types aren't portable and are not available on
FreeBSD (which uses the same uAPI).
Instead of attempting to always output little endian, just use native
endianness in the dumps. Tools can detect the endianness in use by
looking at the 'magic' field, but equally we don't expect big-endian to
be used with Mali (there are no known implementations out there).
Bug: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7252
Fixes: 730c2bf4ad39 ("drm/panfrost: Add support for devcoredump")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221017104602.142992-3-steven.price@arm.com
|
|
The two structs internal to struct panfrost_dump_object_header were
named, but sadly that is incompatible with C++, causing an error: "an
anonymous union may only have public non-static data members".
However nothing refers to struct pan_reg_hdr and struct pan_bomap_hdr
and there's no need to export these definitions, so lets drop them. This
fixes the C++ build error with the minimum change in userspace API.
Reported-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Fixes: 730c2bf4ad39 ("drm/panfrost: Add support for devcoredump")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221017104602.142992-2-steven.price@arm.com
|