summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-06-27selftest: af_unix: Remove test_unix_oob.c.Kuniyuki Iwashima
test_unix_oob.c does not fully cover AF_UNIX's MSG_OOB functionality, thus there are discrepancies between TCP behaviour. Also, the test uses fork() to create message producer, and it's not easy to understand and add more test cases. Let's remove test_unix_oob.c and rewrite a new test. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-27tracing/net_sched: NULL pointer dereference in perf_trace_qdisc_reset()Yunseong Kim
In the TRACE_EVENT(qdisc_reset) NULL dereference occurred from qdisc->dev_queue->dev <NULL> ->name This situation simulated from bunch of veths and Bluetooth disconnection and reconnection. During qdisc initialization, qdisc was being set to noop_queue. In veth_init_queue, the initial tx_num was reduced back to one, causing the qdisc reset to be called with noop, which led to the kernel panic. I've attached the GitHub gist link that C converted syz-execprogram source code and 3 log of reproduced vmcore-dmesg. https://gist.github.com/yskelg/cc64562873ce249cdd0d5a358b77d740 Yeoreum and I use two fuzzing tool simultaneously. One process with syz-executor : https://github.com/google/syzkaller $ ./syz-execprog -executor=./syz-executor -repeat=1 -sandbox=setuid \ -enable=none -collide=false log1 The other process with perf fuzzer: https://github.com/deater/perf_event_tests/tree/master/fuzzer $ perf_event_tests/fuzzer/perf_fuzzer I think this will happen on the kernel version. Linux kernel version +v6.7.10, +v6.8, +v6.9 and it could happen in v6.10. This occurred from 51270d573a8d. I think this patch is absolutely necessary. Previously, It was showing not intended string value of name. I've reproduced 3 time from my fedora 40 Debug Kernel with any other module or patched. version: 6.10.0-0.rc2.20240608gitdc772f8237f9.29.fc41.aarch64+debug [ 5287.164555] veth0_vlan: left promiscuous mode [ 5287.164929] veth1_macvtap: left promiscuous mode [ 5287.164950] veth0_macvtap: left promiscuous mode [ 5287.164983] veth1_vlan: left promiscuous mode [ 5287.165008] veth0_vlan: left promiscuous mode [ 5287.165450] veth1_macvtap: left promiscuous mode [ 5287.165472] veth0_macvtap: left promiscuous mode [ 5287.165502] veth1_vlan: left promiscuous mode … [ 5297.598240] bridge0: port 2(bridge_slave_1) entered blocking state [ 5297.598262] bridge0: port 2(bridge_slave_1) entered forwarding state [ 5297.598296] bridge0: port 1(bridge_slave_0) entered blocking state [ 5297.598313] bridge0: port 1(bridge_slave_0) entered forwarding state [ 5297.616090] 8021q: adding VLAN 0 to HW filter on device bond0 [ 5297.620405] bridge0: port 1(bridge_slave_0) entered disabled state [ 5297.620730] bridge0: port 2(bridge_slave_1) entered disabled state [ 5297.627247] 8021q: adding VLAN 0 to HW filter on device team0 [ 5297.629636] bridge0: port 1(bridge_slave_0) entered blocking state … [ 5298.002798] bridge_slave_0: left promiscuous mode [ 5298.002869] bridge0: port 1(bridge_slave_0) entered disabled state [ 5298.309444] bond0 (unregistering): (slave bond_slave_0): Releasing backup interface [ 5298.315206] bond0 (unregistering): (slave bond_slave_1): Releasing backup interface [ 5298.320207] bond0 (unregistering): Released all slaves [ 5298.354296] hsr_slave_0: left promiscuous mode [ 5298.360750] hsr_slave_1: left promiscuous mode [ 5298.374889] veth1_macvtap: left promiscuous mode [ 5298.374931] veth0_macvtap: left promiscuous mode [ 5298.374988] veth1_vlan: left promiscuous mode [ 5298.375024] veth0_vlan: left promiscuous mode [ 5299.109741] team0 (unregistering): Port device team_slave_1 removed [ 5299.185870] team0 (unregistering): Port device team_slave_0 removed … [ 5300.155443] Bluetooth: hci3: unexpected cc 0x0c03 length: 249 > 1 [ 5300.155724] Bluetooth: hci3: unexpected cc 0x1003 length: 249 > 9 [ 5300.155988] Bluetooth: hci3: unexpected cc 0x1001 length: 249 > 9 …. [ 5301.075531] team0: Port device team_slave_1 added [ 5301.085515] bridge0: port 1(bridge_slave_0) entered blocking state [ 5301.085531] bridge0: port 1(bridge_slave_0) entered disabled state [ 5301.085588] bridge_slave_0: entered allmulticast mode [ 5301.085800] bridge_slave_0: entered promiscuous mode [ 5301.095617] bridge0: port 1(bridge_slave_0) entered blocking state [ 5301.095633] bridge0: port 1(bridge_slave_0) entered disabled state … [ 5301.149734] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link [ 5301.173234] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link [ 5301.180517] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link [ 5301.193481] hsr_slave_0: entered promiscuous mode [ 5301.204425] hsr_slave_1: entered promiscuous mode [ 5301.210172] debugfs: Directory 'hsr0' with parent 'hsr' already present! [ 5301.210185] Cannot create hsr debugfs directory [ 5301.224061] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link [ 5301.246901] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link [ 5301.255934] team0: Port device team_slave_0 added [ 5301.256480] team0: Port device team_slave_1 added [ 5301.256948] team0: Port device team_slave_0 added … [ 5301.435928] hsr_slave_0: entered promiscuous mode [ 5301.446029] hsr_slave_1: entered promiscuous mode [ 5301.455872] debugfs: Directory 'hsr0' with parent 'hsr' already present! [ 5301.455884] Cannot create hsr debugfs directory [ 5301.502664] hsr_slave_0: entered promiscuous mode [ 5301.513675] hsr_slave_1: entered promiscuous mode [ 5301.526155] debugfs: Directory 'hsr0' with parent 'hsr' already present! [ 5301.526164] Cannot create hsr debugfs directory [ 5301.563662] hsr_slave_0: entered promiscuous mode [ 5301.576129] hsr_slave_1: entered promiscuous mode [ 5301.580259] debugfs: Directory 'hsr0' with parent 'hsr' already present! [ 5301.580270] Cannot create hsr debugfs directory [ 5301.590269] 8021q: adding VLAN 0 to HW filter on device bond0 [ 5301.595872] KASAN: null-ptr-deref in range [0x0000000000000130-0x0000000000000137] [ 5301.595877] Mem abort info: [ 5301.595881] ESR = 0x0000000096000006 [ 5301.595885] EC = 0x25: DABT (current EL), IL = 32 bits [ 5301.595889] SET = 0, FnV = 0 [ 5301.595893] EA = 0, S1PTW = 0 [ 5301.595896] FSC = 0x06: level 2 translation fault [ 5301.595900] Data abort info: [ 5301.595903] ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000 [ 5301.595907] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 5301.595911] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 5301.595915] [dfff800000000026] address between user and kernel address ranges [ 5301.595971] Internal error: Oops: 0000000096000006 [#1] SMP … [ 5301.596076] CPU: 2 PID: 102769 Comm: syz-executor.3 Kdump: loaded Tainted: G W ------- --- 6.10.0-0.rc2.20240608gitdc772f8237f9.29.fc41.aarch64+debug #1 [ 5301.596080] Hardware name: VMware, Inc. VMware20,1/VBSA, BIOS VMW201.00V.21805430.BA64.2305221830 05/22/2023 [ 5301.596082] pstate: 01400005 (nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 5301.596085] pc : strnlen+0x40/0x88 [ 5301.596114] lr : trace_event_get_offsets_qdisc_reset+0x6c/0x2b0 [ 5301.596124] sp : ffff8000beef6b40 [ 5301.596126] x29: ffff8000beef6b40 x28: dfff800000000000 x27: 0000000000000001 [ 5301.596131] x26: 6de1800082c62bd0 x25: 1ffff000110aa9e0 x24: ffff800088554f00 [ 5301.596136] x23: ffff800088554ec0 x22: 0000000000000130 x21: 0000000000000140 [ 5301.596140] x20: dfff800000000000 x19: ffff8000beef6c60 x18: ffff7000115106d8 [ 5301.596143] x17: ffff800121bad000 x16: ffff800080020000 x15: 0000000000000006 [ 5301.596147] x14: 0000000000000002 x13: ffff0001f3ed8d14 x12: ffff700017ddeda5 [ 5301.596151] x11: 1ffff00017ddeda4 x10: ffff700017ddeda4 x9 : ffff800082cc5eec [ 5301.596155] x8 : 0000000000000004 x7 : 00000000f1f1f1f1 x6 : 00000000f2f2f200 [ 5301.596158] x5 : 00000000f3f3f3f3 x4 : ffff700017dded80 x3 : 00000000f204f1f1 [ 5301.596162] x2 : 0000000000000026 x1 : 0000000000000000 x0 : 0000000000000130 [ 5301.596166] Call trace: [ 5301.596175] strnlen+0x40/0x88 [ 5301.596179] trace_event_get_offsets_qdisc_reset+0x6c/0x2b0 [ 5301.596182] perf_trace_qdisc_reset+0xb0/0x538 [ 5301.596184] __traceiter_qdisc_reset+0x68/0xc0 [ 5301.596188] qdisc_reset+0x43c/0x5e8 [ 5301.596190] netif_set_real_num_tx_queues+0x288/0x770 [ 5301.596194] veth_init_queues+0xfc/0x130 [veth] [ 5301.596198] veth_newlink+0x45c/0x850 [veth] [ 5301.596202] rtnl_newlink_create+0x2c8/0x798 [ 5301.596205] __rtnl_newlink+0x92c/0xb60 [ 5301.596208] rtnl_newlink+0xd8/0x130 [ 5301.596211] rtnetlink_rcv_msg+0x2e0/0x890 [ 5301.596214] netlink_rcv_skb+0x1c4/0x380 [ 5301.596225] rtnetlink_rcv+0x20/0x38 [ 5301.596227] netlink_unicast+0x3c8/0x640 [ 5301.596231] netlink_sendmsg+0x658/0xa60 [ 5301.596234] __sock_sendmsg+0xd0/0x180 [ 5301.596243] __sys_sendto+0x1c0/0x280 [ 5301.596246] __arm64_sys_sendto+0xc8/0x150 [ 5301.596249] invoke_syscall+0xdc/0x268 [ 5301.596256] el0_svc_common.constprop.0+0x16c/0x240 [ 5301.596259] do_el0_svc+0x48/0x68 [ 5301.596261] el0_svc+0x50/0x188 [ 5301.596265] el0t_64_sync_handler+0x120/0x130 [ 5301.596268] el0t_64_sync+0x194/0x198 [ 5301.596272] Code: eb15001f 54000120 d343fc02 12000801 (38f46842) [ 5301.596285] SMP: stopping secondary CPUs [ 5301.597053] Starting crashdump kernel... [ 5301.597057] Bye! After applying our patch, I didn't find any kernel panic errors. We've found a simple reproducer # echo 1 > /sys/kernel/debug/tracing/events/qdisc/qdisc_reset/enable # ip link add veth0 type veth peer name veth1 Error: Unknown device type. However, without our patch applied, I tested upstream 6.10.0-rc3 kernel using the qdisc_reset event and the ip command on my qemu virtual machine. This 2 commands makes always kernel panic. Linux version: 6.10.0-rc3 [ 0.000000] Linux version 6.10.0-rc3-00164-g44ef20baed8e-dirty (paran@fedora) (gcc (GCC) 14.1.1 20240522 (Red Hat 14.1.1-4), GNU ld version 2.41-34.fc40) #20 SMP PREEMPT Sat Jun 15 16:51:25 KST 2024 Kernel panic message: [ 615.236484] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP [ 615.237250] Dumping ftrace buffer: [ 615.237679] (ftrace buffer empty) [ 615.238097] Modules linked in: veth crct10dif_ce virtio_gpu virtio_dma_buf drm_shmem_helper drm_kms_helper zynqmp_fpga xilinx_can xilinx_spi xilinx_selectmap xilinx_core xilinx_pr_decoupler versal_fpga uvcvideo uvc videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videodev videobuf2_common mc usbnet deflate zstd ubifs ubi rcar_canfd rcar_can omap_mailbox ntb_msi_test ntb_hw_epf lattice_sysconfig_spi lattice_sysconfig ice40_spi gpio_xilinx dwmac_altr_socfpga mdio_regmap stmmac_platform stmmac pcs_xpcs dfl_fme_region dfl_fme_mgr dfl_fme_br dfl_afu dfl fpga_region fpga_bridge can can_dev br_netfilter bridge stp llc atl1c ath11k_pci mhi ath11k_ahb ath11k qmi_helpers ath10k_sdio ath10k_pci ath10k_core ath mac80211 libarc4 cfg80211 drm fuse backlight ipv6 Jun 22 02:36:5[3 6k152.62-4sm98k4-0k]v kCePUr:n e1l :P IUDn:a b4le6 8t oC ohmma: nidpl eN oketr nteali nptaedg i6n.g1 0re.0q-urecs3t- 0at0 1v6i4r-tgu4a4le fa2d0dbraeeds0se-dir tyd f#f2f08 615.252376] Hardware name: linux,dummy-virt (DT) [ 615.253220] pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 615.254433] pc : strnlen+0x6c/0xe0 [ 615.255096] lr : trace_event_get_offsets_qdisc_reset+0x94/0x3d0 [ 615.256088] sp : ffff800080b269a0 [ 615.256615] x29: ffff800080b269a0 x28: ffffc070f3f98500 x27: 0000000000000001 [ 615.257831] x26: 0000000000000010 x25: ffffc070f3f98540 x24: ffffc070f619cf60 [ 615.259020] x23: 0000000000000128 x22: 0000000000000138 x21: dfff800000000000 [ 615.260241] x20: ffffc070f631ad00 x19: 0000000000000128 x18: ffffc070f448b800 [ 615.261454] x17: 0000000000000000 x16: 0000000000000001 x15: ffffc070f4ba2a90 [ 615.262635] x14: ffff700010164d73 x13: 1ffff80e1e8d5eb3 x12: 1ffff00010164d72 [ 615.263877] x11: ffff700010164d72 x10: dfff800000000000 x9 : ffffc070e85d6184 [ 615.265047] x8 : ffffc070e4402070 x7 : 000000000000f1f1 x6 : 000000001504a6d3 [ 615.266336] x5 : ffff28ca21122140 x4 : ffffc070f5043ea8 x3 : 0000000000000000 [ 615.267528] x2 : 0000000000000025 x1 : 0000000000000000 x0 : 0000000000000000 [ 615.268747] Call trace: [ 615.269180] strnlen+0x6c/0xe0 [ 615.269767] trace_event_get_offsets_qdisc_reset+0x94/0x3d0 [ 615.270716] trace_event_raw_event_qdisc_reset+0xe8/0x4e8 [ 615.271667] __traceiter_qdisc_reset+0xa0/0x140 [ 615.272499] qdisc_reset+0x554/0x848 [ 615.273134] netif_set_real_num_tx_queues+0x360/0x9a8 [ 615.274050] veth_init_queues+0x110/0x220 [veth] [ 615.275110] veth_newlink+0x538/0xa50 [veth] [ 615.276172] __rtnl_newlink+0x11e4/0x1bc8 [ 615.276944] rtnl_newlink+0xac/0x120 [ 615.277657] rtnetlink_rcv_msg+0x4e4/0x1370 [ 615.278409] netlink_rcv_skb+0x25c/0x4f0 [ 615.279122] rtnetlink_rcv+0x48/0x70 [ 615.279769] netlink_unicast+0x5a8/0x7b8 [ 615.280462] netlink_sendmsg+0xa70/0x1190 Yeoreum and I don't know if the patch we wrote will fix the underlying cause, but we think that priority is to prevent kernel panic happening. So, we're sending this patch. Fixes: 51270d573a8d ("tracing/net_sched: Fix tracepoints that save qdisc_dev() as a string") Link: https://lore.kernel.org/lkml/20240229143432.273b4871@gandalf.local.home/t/ Cc: netdev@vger.kernel.org Tested-by: Yunseong Kim <yskelg@gmail.com> Signed-off-by: Yunseong Kim <yskelg@gmail.com> Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com> Link: https://lore.kernel.org/r/20240624173320.24945-4-yskelg@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-27Merge tag 'amd-drm-fixes-6.10-2024-06-26' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.10-2024-06-26: amdgpu: - SMU 14.x fix - vram info parsing fix - mode1 reset fix - LTTPR fix - Virtual display fix - Avoid spurious error in PSP init Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240626221408.2019633-1-alexander.deucher@amd.com
2024-06-27Merge tag 'drm-misc-fixes-2024-06-26' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-fixes for v6.10-rc6: - nouveau tv mode fixes. - Add KOE TX26D202VM0BWA timings. - Fix fb_info when vmalloc is used, regression from CONFIG_DRM_FBDEV_LEAK_PHYS_SMEM. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/2e596c1e-9389-43c2-a029-06fe741c44c3@linux.intel.com
2024-06-26Merge tag 'mm-hotfixes-stable-2024-06-26-17-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "13 hotfixes, 7 are cc:stable. All are MM related apart from a MAINTAINERS update. There is no identifiable theme here - just singleton patches in various places" * tag 'mm-hotfixes-stable-2024-06-26-17-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/memory: don't require head page for do_set_pmd() mm/page_alloc: Separate THP PCP into movable and non-movable categories nfs: drop the incorrect assertion in nfs_swap_rw() mm/migrate: make migrate_pages_batch() stats consistent MAINTAINERS: TPM DEVICE DRIVER: update the W-tag selftests/mm:fix test_prctl_fork_exec return failure mm: convert page type macros to enum ocfs2: fix DIO failure due to insufficient transaction credits kasan: fix bad call to unpoison_slab_object mm: handle profiling for fake memory allocations during compaction mm/slab: fix 'variable obj_exts set but not used' warning /proc/pid/smaps: add mseal info for vma mm: fix incorrect vbq reference in purge_fragmented_block
2024-06-27netfilter: nf_tables: fully validate NFT_DATA_VALUE on store to data registersPablo Neira Ayuso
register store validation for NFT_DATA_VALUE is conditional, however, the datatype is always either NFT_DATA_VALUE or NFT_DATA_VERDICT. This only requires a new helper function to infer the register type from the set datatype so this conditional check can be removed. Otherwise, pointer to chain object can be leaked through the registers. Fixes: 96518518cc41 ("netfilter: add nftables") Reported-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-06-26Merge tag 'wq-for-6.10-rc5-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fixes from Tejun Heo: "Two patches to fix kworker name formatting" * tag 'wq-for-6.10-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Increase worker desc's length to 32 workqueue: Refactor worker ID formatting and make wq_worker_comm() use full ID string
2024-06-26Merge tag 'asoc-fix-v6.10-rc5' of ↵Takashi Iwai
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.10 A relatively large batch of updates, largely due to the long interval since I last sent fixes due to various travel and holidays. There's a lot of driver specific fixes and quirks in here, none of them too major, and also some fixes for recently introduced memory safety issues in the topology code.
2024-06-26nvmet-fc: Remove __counted_by from nvmet_fc_tgt_queue.fod[]Nathan Chancellor
Work for __counted_by on generic pointers in structures (not just flexible array members) has started landing in Clang 19 (current tip of tree). During the development of this feature, a restriction was added to __counted_by to prevent the flexible array member's element type from including a flexible array member itself such as: struct foo { int count; char buf[]; }; struct bar { int count; struct foo data[] __counted_by(count); }; because the size of data cannot be calculated with the standard array size formula: sizeof(struct foo) * count This restriction was downgraded to a warning but due to CONFIG_WERROR, it can still break the build. The application of __counted_by on the fod member of 'struct nvmet_fc_tgt_queue' triggers this restriction, resulting in: drivers/nvme/target/fc.c:151:2: error: 'counted_by' should not be applied to an array with element of unknown size because 'struct nvmet_fc_fcp_iod' is a struct type with a flexible array member. This will be an error in a future compiler version [-Werror,-Wbounds-safety-counted-by-elt-type-unknown-size] 151 | struct nvmet_fc_fcp_iod fod[] __counted_by(sqsize); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. Remove this use of __counted_by to fix the warning/error. However, rather than remove it altogether, leave it commented, as it may be possible to support this in future compiler releases. Cc: stable@vger.kernel.org Closes: https://github.com/ClangBuiltLinux/linux/issues/2027 Fixes: ccd3129aca28 ("nvmet-fc: Annotate struct nvmet_fc_tgt_queue with __counted_by") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-06-26ASoC: rt5645: fix issue of random interrupt from push-buttonJack Yu
Modify register setting sequence of enabling inline command to fix issue of random interrupt from push-button. Signed-off-by: Jack Yu <jack.yu@realtek.com> Link: https://patch.msgid.link/9a7a3a66cbcb426487ca6f558f45e922@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-06-26ALSA: seq: Fix missing MSB in MIDI2 SPP conversionTakashi Iwai
The conversion of SPP to MIDI2 UMP called a wrong function, and the secondary argument wasn't taken. As a result, MSB of SPP was always zero. Fix to call the right function. Fixes: e9e02819a98a ("ALSA: seq: Automatic conversion of UMP events") Link: https://patch.msgid.link/20240626145141.16648-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-06-26Merge patch "riscv: stacktrace: convert arch_stack_walk() to noinstr"Palmer Dabbelt
This first patch in the larger series is a fix, so I'm merging it into fixes while the rest of the patch set is still under development. * b4-shazam-merge: riscv: stacktrace: convert arch_stack_walk() to noinstr Link: https://lore.kernel.org/r/20240613-dev-andyc-dyn-ftrace-v4-v1-0-1a538e12c01e@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26riscv: stacktrace: convert arch_stack_walk() to noinstrAndy Chiu
arch_stack_walk() is called intensively in function_graph when the kernel is compiled with CONFIG_TRACE_IRQFLAGS. As a result, the kernel logs a lot of arch_stack_walk and its sub-functions into the ftrace buffer. However, these functions should not appear on the trace log because they are part of the ftrace itself. This patch references what arm64 does for the smae function. So it further prevent the re-enter kprobe issue, which is also possible on riscv. Related-to: commit 0fbcd8abf337 ("arm64: Prohibit instrumentation on arch_stack_walk()") Fixes: 680341382da5 ("riscv: add CALLER_ADDRx support") Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240613-dev-andyc-dyn-ftrace-v4-v1-1-1a538e12c01e@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26riscv: patch: Flush the icache right after patching to avoid illegal insnsAlexandre Ghiti
We cannot delay the icache flush after patching some functions as we may have patched a function that will get called before the icache flush. The only way to completely avoid such scenario is by flushing the icache as soon as we patch a function. This will probably be costly as we don't batch the icache maintenance anymore. Fixes: 6ca445d8af0e ("riscv: Fix early ftrace nop patching") Reported-by: Conor Dooley <conor.dooley@microchip.com> Closes: https://lore.kernel.org/linux-riscv/20240613-lubricant-breath-061192a9489a@wendy/ Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Andy Chiu <andy.chiu@sifive.com> Link: https://lore.kernel.org/r/20240624082141.153871-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26net: usb: qmi_wwan: add Telit FN912 compositionsDaniele Palmas
Add the following Telit FN912 compositions: 0x3000: rmnet + tty (AT/NMEA) + tty (AT) + tty (diag) T: Bus=03 Lev=01 Prnt=03 Port=07 Cnt=01 Dev#= 8 Spd=480 MxCh= 0 D: Ver= 2.01 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=3000 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FN912 S: SerialNumber=92c4c4d8 C: #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms 0x3001: rmnet + tty (AT) + tty (diag) + DPL (data packet logging) + adb T: Bus=03 Lev=01 Prnt=03 Port=07 Cnt=01 Dev#= 7 Spd=480 MxCh= 0 D: Ver= 2.01 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=3001 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FN912 S: SerialNumber=92c4c4d8 C: #Ifs= 5 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none) E: Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=usbfs E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Link: https://patch.msgid.link/20240625102236.69539-1-dnlplm@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-26ASoC: amd: yc: Fix non-functional mic on ASUS M5602RAVyacheslav Frantsishko
The Vivobook S 16X IPS needs a quirks-table entry for the internal microphone to function properly. Signed-off-by: Vyacheslav Frantsishko <itmymaill@gmail.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patch.msgid.link/20240626070334.45633-1-itmymaill@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-06-26gpio: graniterapids: Add missing raw_spinlock_init()Aapo Vienamo
Add the missing raw_spin_lock_init() call to gnr_gpio_probe(). Fixes: ecc4b1418e23 ("gpio: Add Intel Granite Rapids-D vGPIO driver") Signed-off-by: Aapo Vienamo <aapo.vienamo@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://lore.kernel.org/r/20240625135343.673745-1-aapo.vienamo@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-06-26ALSA: hda/realtek: fix mute/micmute LEDs don't work for EliteBook 645/665 G11.Dirk Su
HP EliteBook 645/665 G11 needs ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF quirk to make mic-mute/audio-mute working. Signed-off-by: Dirk Su <dirk.su@canonical.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20240626021437.77039-1-dirk.su@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-06-26ALSA: hda/realtek: Fix conflicting quirk for PCI SSID 17aa:3820Takashi Iwai
The recent fix for Lenovo IdeaPad 330-17IKB replaced the quirk entry, and this eventually breaks the existing quirk for Lenovo Yoga Duet 7 13ITL6 equipped with the same PCI SSID 17aa:3820. For applying a proper quirk for each model, check the codec SSID additionally. Fortunately Yoga Duet has a different codec SSID, 0x17aa3802. (Interestingly, 17aa:3802 has another conflict of SSID between another Yoga model vs 14IRP8 which we had to work around similarly.) Fixes: b1fd0d1285b1 ("ALSA: hda/realtek: Enable headset mic on IdeaPad 330-17IKB 81DM") Link: https://patch.msgid.link/20240625155217.18767-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-06-25bcachefs: Fix kmalloc bug in __snapshot_t_mutPei Li
When allocating too huge a snapshot table, we should fail gracefully in __snapshot_t_mut() instead of fail in kmalloc(). Reported-by: syzbot+770e99b65e26fa023ab1@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=770e99b65e26fa023ab1 Tested-by: syzbot+770e99b65e26fa023ab1@syzkaller.appspotmail.com Signed-off-by: Pei Li <peili.dev@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-25tcp: fix tcp_rcv_fastopen_synack() to enter TCP_CA_Loss for failed TFONeal Cardwell
Testing determined that the recent commit 9e046bb111f1 ("tcp: clear tp->retrans_stamp in tcp_rcv_fastopen_synack()") has a race, and does not always ensure retrans_stamp is 0 after a TFO payload retransmit. If transmit completion for the SYN+data skb happens after the client TCP stack receives the SYNACK (which sometimes happens), then retrans_stamp can erroneously remain non-zero for the lifetime of the connection, causing a premature ETIMEDOUT later. Testing and tracing showed that the buggy scenario is the following somewhat tricky sequence: + Client attempts a TFO handshake. tcp_send_syn_data() sends SYN + TFO cookie + data in a single packet in the syn_data skb. It hands the syn_data skb to tcp_transmit_skb(), which makes a clone. Crucially, it then reuses the same original (non-clone) syn_data skb, transforming it by advancing the seq by one byte and removing the FIN bit, and enques the resulting payload-only skb in the sk->tcp_rtx_queue. + Client sets retrans_stamp to the start time of the three-way handshake. + Cookie mismatches or server has TFO disabled, and server only ACKs SYN. + tcp_ack() sees SYN is acked, tcp_clean_rtx_queue() clears retrans_stamp. + Since the client SYN was acked but not the payload, the TFO failure code path in tcp_rcv_fastopen_synack() tries to retransmit the payload skb. However, in some cases the transmit completion for the clone of the syn_data (which had SYN + TFO cookie + data) hasn't happened. In those cases, skb_still_in_host_queue() returns true for the retransmitted TFO payload, because the clone of the syn_data skb has not had its tx completetion. + Because skb_still_in_host_queue() finds skb_fclone_busy() is true, it sets the TSQ_THROTTLED bit and the retransmit does not happen in the tcp_rcv_fastopen_synack() call chain. + The tcp_rcv_fastopen_synack() code next implicitly assumes the retransmit process is finished, and sets retrans_stamp to 0 to clear it, but this is later overwritten (see below). + Later, upon tx completion, tcp_tsq_write() calls tcp_xmit_retransmit_queue(), which puts the retransmit in flight and sets retrans_stamp to a non-zero value. + The client receives an ACK for the retransmitted TFO payload data. + Since we're in CA_Open and there are no dupacks/SACKs/DSACKs/ECN to make tcp_ack_is_dubious() true and make us call tcp_fastretrans_alert() and reach a code path that clears retrans_stamp, retrans_stamp stays nonzero. + Later, if there is a TLP, RTO, RTO sequence, then the connection will suffer an early ETIMEDOUT due to the erroneously ancient retrans_stamp. The fix: this commit refactors the code to have tcp_rcv_fastopen_synack() retransmit by reusing the relevant parts of tcp_simple_retransmit() that enter CA_Loss (without changing cwnd) and call tcp_xmit_retransmit_queue(). We have tcp_simple_retransmit() and tcp_rcv_fastopen_synack() share code in this way because in both cases we get a packet indicating non-congestion loss (MTU reduction or TFO failure) and thus in both cases we want to retransmit as many packets as cwnd allows, without reducing cwnd. And given that retransmits will set retrans_stamp to a non-zero value (and may do so in a later calling context due to TSQ), we also want to enter CA_Loss so that we track when all retransmitted packets are ACked and clear retrans_stamp when that happens (to ensure later recurring RTOs are using the correct retrans_stamp and don't declare ETIMEDOUT prematurely). Fixes: 9e046bb111f1 ("tcp: clear tp->retrans_stamp in tcp_rcv_fastopen_synack()") Fixes: a7abf3cd76e1 ("tcp: consider using standard rtx logic in tcp_rcv_fastopen_synack()") Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Link: https://patch.msgid.link/20240624144323.2371403-1-ncardwell.sw@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-25ionic: use dev_consume_skb_any outside of napiShannon Nelson
If we're not in a NAPI softirq context, we need to be careful about how we call napi_consume_skb(), specifically we need to call it with budget==0 to signal to it that we're not in a safe context. This was found while running some configuration stress testing of traffic and a change queue config loop running, and this curious note popped out: [ 4371.402645] BUG: using smp_processor_id() in preemptible [00000000] code: ethtool/20545 [ 4371.402897] caller is napi_skb_cache_put+0x16/0x80 [ 4371.403120] CPU: 25 PID: 20545 Comm: ethtool Kdump: loaded Tainted: G OE 6.10.0-rc3-netnext+ #8 [ 4371.403302] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 01/23/2021 [ 4371.403460] Call Trace: [ 4371.403613] <TASK> [ 4371.403758] dump_stack_lvl+0x4f/0x70 [ 4371.403904] check_preemption_disabled+0xc1/0xe0 [ 4371.404051] napi_skb_cache_put+0x16/0x80 [ 4371.404199] ionic_tx_clean+0x18a/0x240 [ionic] [ 4371.404354] ionic_tx_cq_service+0xc4/0x200 [ionic] [ 4371.404505] ionic_tx_flush+0x15/0x70 [ionic] [ 4371.404653] ? ionic_lif_qcq_deinit.isra.23+0x5b/0x70 [ionic] [ 4371.404805] ionic_txrx_deinit+0x71/0x190 [ionic] [ 4371.404956] ionic_reconfigure_queues+0x5f5/0xff0 [ionic] [ 4371.405111] ionic_set_ringparam+0x2e8/0x3e0 [ionic] [ 4371.405265] ethnl_set_rings+0x1f1/0x300 [ 4371.405418] ethnl_default_set_doit+0xbb/0x160 [ 4371.405571] genl_family_rcv_msg_doit+0xff/0x130 [...] I found that ionic_tx_clean() calls napi_consume_skb() which calls napi_skb_cache_put(), but before that last call is the note /* Zero budget indicate non-NAPI context called us, like netpoll */ and DEBUG_NET_WARN_ON_ONCE(!in_softirq()); Those are pretty big hints that we're doing it wrong. We can pass a context hint down through the calls to let ionic_tx_clean() know what we're doing so it can call napi_consume_skb() correctly. Fixes: 386e69865311 ("ionic: Make use napi_consume_skb") Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://patch.msgid.link/20240624175015.4520-1-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-25bcachefs: Discard, invalidate workers are now per deviceKent Overstreet
There's no reason for discards to be single threaded across all devices; this will improve performance on multi device setups. Additionally, making them per-device simplifies the refcounting on bch_dev->io_ref; we now hold it for the duration that the discard path is running, which fixes a race between the discard path and device removal. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-25bcachefs: Fix shift-out-of-bounds in bch2_blacklist_entries_gcPei Li
This series fix the shift-out-of-bounds issue in bch2_blacklist_entries_gc(). Instead of passing 0 to eytzinger0_first() when iterating the entries, we explicitly check 0 and initialize i to be 0. syzbot has tested the proposed patch and the reproducer did not trigger any issue: Reported-and-tested-by: syzbot+835d255ad6bc7f29ee12@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=835d255ad6bc7f29ee12 Signed-off-by: Pei Li <peili.dev@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-25bcachefs: slab-use-after-free Read in bch2_sb_errors_from_cpuPei Li
Acquire fsck_error_counts_lock before accessing the critical section protected by this lock. syzbot has tested the proposed patch and the reproducer did not trigger any issue. Reported-by: syzbot+a2bc0e838efd7663f4d9@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a2bc0e838efd7663f4d9 Signed-off-by: Pei Li <peili.dev@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-06-25drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_ld_modesMa Ke
In nv17_tv_get_ld_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a possible NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd. Cc: stable@vger.kernel.org Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240625081828.2620794-1-make24@iscas.ac.cn
2024-06-25drm/nouveau/dispnv04: fix null pointer dereference in nv17_tv_get_hd_modesMa Ke
In nv17_tv_get_hd_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a possible NULL pointer dereference on failure of drm_mode_duplicate(). The same applies to drm_cvt_mode(). Add a check to avoid null pointer dereference. Cc: stable@vger.kernel.org Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240625081029.2619437-1-make24@iscas.ac.cn
2024-06-25drm/amdgpu: Don't show false warning for reg listLijo Lazar
If reg list is already loaded on PSP 13.0.2 SOCs, psp will give TEE_ERR_CANCEL response on second time load. Avoid printing warn message for it. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-25drm/amdgpu: avoid using null object of framebufferJulia Zhang
Instead of using state->fb->obj[0] directly, get object from framebuffer by calling drm_gem_fb_get_obj() and return error code when object is null to avoid using null object of framebuffer. Reported-by: Fusheng Huang <fusheng.huang@ecarxgroup.com> Signed-off-by: Julia Zhang <Julia.Zhang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-06-25drm/amd/display: Send DP_TOTAL_LTTPR_CNT during detection if LTTPR is presentMichael Strauss
[WHY] New register field added in DP2.1 SCR, needed for auxless ALPM [HOW] Echo value read from 0xF0007 back to sink Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Michael Strauss <michael.strauss@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-25drm/amdgpu: Fix pci state save during mode-1 resetLijo Lazar
Cache the PCI state before bus master is disabled. The saved state is later used for other cases like restoring config space after mode-2 reset. Fixes: 5c03e5843e6b ("drm/amdgpu:add smu mode1/2 support for aldebaran") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-25drm/amdgpu/atomfirmware: fix parsing of vram_infoAlex Deucher
v3.x changed the how vram width was encoded. The previous implementation actually worked correctly for most boards. Fix the implementation to work correctly everywhere. This fixes the vram width reported in the kernel log on some boards. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2024-06-25drm/amd/swsmu: add MALL init support workaround for smu_v14_0_1Li Ma
[Why] SMU firmware has not supported MALL PG. [How] Disable MALL PG and make it always on until SMU firmware is ready. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-25RISC-V: fix vector insn load/store width maskJesse Taube
RVFDQ_FL_FS_WIDTH_MASK should be 3 bits [14-12], shifted down by 12 bits. Replace GENMASK(3, 0) with GENMASK(2, 0). Fixes: cd054837243b ("riscv: Allocate user's vector context in the first-use trap") Signed-off-by: Jesse Taube <jesse@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20240606182800.415831-1-jesse@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-25Revert "nfsd: fix oops when reading pool_stats before server is started"NeilBrown
This reverts commit 8e948c365d9c10b685d1deb946bd833d6a9b43e0. The reverted commit moves a test on a field protected by a mutex outside of the protection of that mutex, and so is obviously racey. Depending on how the race goes, si->serv might be NULL when dereferenced in svc_pool_stats_start(), or svc_pool_stats_stop() might unlock a mutex that hadn't been locked. This bug that the commit tried to fix has been addressed by initialising ->mutex earlier. Fixes: 8e948c365d9c ("nfsd: fix oops when reading pool_stats before server is started") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-06-25nfsd: initialise nfsd_info.mutex early.NeilBrown
nfsd_info.mutex can be dereferenced by svc_pool_stats_start() immediately after the new netns is created. Currently this can trigger an oops. Move the initialisation earlier before it can possibly be dereferenced. Fixes: 7b207ccd9833 ("svc: don't hold reference for poolstats, only mutex.") Reported-by: Sourabh Jain <sourabhjain@linux.ibm.com> Closes: https://lore.kernel.org/all/c2e9f6de-1ec4-4d3a-b18d-d5a6ec0814a0@linux.ibm.com/ Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-06-25linux/syscalls.h: add missing __user annotationsArnd Bergmann
A couple of declarations in linux/syscalls.h are missing __user annotations on their pointers, which can lead to warnings from sparse because these don't match the implementation that have the correct address space annotations. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25syscalls: mmap(): use unsigned offset type consistentlyArnd Bergmann
Most architectures that implement the old-style mmap() with byte offset use 'unsigned long' as the type for that offset, but microblaze and riscv have the off_t type that is shared with userspace, matching the prototype in include/asm-generic/syscalls.h. Make this consistent by using an unsigned argument everywhere. This changes the behavior slightly, as the argument is shifted to a page number, and an user input with the top bit set would result in a negative page offset rather than a large one as we use elsewhere. For riscv, the 32-bit sys_mmap2() definition actually used a custom type that is different from the global declaration, but this was missed due to an incorrect type check. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25s390: remove native mmap2() syscallArnd Bergmann
The mmap2() syscall has never been used on 64-bit s390x and should have been removed as part of 5a79859ae0f3 ("s390: remove 31 bit support"). Remove it now. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25hexagon: fix fadvise64_64 calling conventionsArnd Bergmann
fadvise64_64() has two 64-bit arguments at the wrong alignment for hexagon, which turns them into a 7-argument syscall that is not supported by Linux. The downstream musl port for hexagon actually asks for a 6-argument version the same way we do it on arm, csky, powerpc, so make the kernel do it the same way to avoid having to change both. Link: https://github.com/quic/musl/blob/hexagon/arch/hexagon/syscall_arch.h#L78 Cc: stable@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25csky, hexagon: fix broken sys_sync_file_rangeArnd Bergmann
Both of these architectures require u64 function arguments to be passed in even/odd pairs of registers or stack slots, which in case of sync_file_range would result in a seven-argument system call that is not currently possible. The system call is therefore incompatible with all existing binaries. While it would be possible to implement support for seven arguments like on mips, it seems better to use a six-argument version, either with the normal argument order but misaligned as on most architectures or with the reordered sync_file_range2() calling conventions as on arm and powerpc. Cc: stable@vger.kernel.org Acked-by: Guo Ren <guoren@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25sh: rework sync_file_range ABIArnd Bergmann
The unusual function calling conventions on SuperH ended up causing sync_file_range to have the wrong argument order, with the 'flags' argument getting sorted before 'nbytes' by the compiler. In userspace, I found that musl, glibc, uclibc and strace all expect the normal calling conventions with 'nbytes' last, so changing the kernel to match them should make all of those work. In order to be able to also fix libc implementations to work with existing kernels, they need to be able to tell which ABI is used. An easy way to do this is to add yet another system call using the sync_file_range2 ABI that works the same on all architectures. Old user binaries can now work on new kernels, and new binaries can try the new sync_file_range2() to work with new kernels or fall back to the old sync_file_range() version if that doesn't exist. Cc: stable@vger.kernel.org Fixes: 75c92acdd5b1 ("sh: Wire up new syscalls.") Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25powerpc: restore some missing spu syscallsArnd Bergmann
A couple of system calls were inadventently removed from the table during a bugfix for 32-bit powerpc entry. Restore the original behavior. Fixes: e23750623835 ("powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs") Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25parisc: use generic sys_fanotify_mark implementationArnd Bergmann
The sys_fanotify_mark() syscall on parisc uses the reverse word order for the two halves of the 64-bit argument compared to all syscalls on all 32-bit architectures. As far as I can tell, the problem is that the function arguments on parisc are sorted backwards (26, 25, 24, 23, ...) compared to everyone else, so the calling conventions of using an even/odd register pair in native word order result in the lower word coming first in function arguments, matching the expected behavior on little-endian architectures. The system call conventions however ended up matching what the other 32-bit architectures do. A glibc cleanup in 2020 changed the userspace behavior in a way that handles all architectures consistently, but this inadvertently broke parisc32 by changing to the same method as everyone else. The change made it into glibc-2.35 and subsequently into debian 12 (bookworm), which is the latest stable release. This means we need to choose between reverting the glibc change or changing the kernel to match it again, but either hange will leave some systems broken. Pick the option that is more likely to help current and future users and change the kernel to match current glibc. This also means the behavior is now consistent across architectures, but it breaks running new kernels with old glibc builds before 2.35. Link: https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=d150181d73d9 Link: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/arch/parisc/kernel/sys_parisc.c?h=57b1dfbd5b4a39d Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org> Tested-by: Helge Deller <deller@gmx.de> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- I found this through code inspection, please double-check to make sure I got the bug and the fix right. The alternative is to fix this by reverting glibc back to the unusual behavior.
2024-06-25parisc: use correct compat recv/recvfrom syscallsArnd Bergmann
Johannes missed parisc back when he introduced the compat version of these syscalls, so receiving cmsg messages that require a compat conversion is still broken. Use the correct calls like the other architectures do. Fixes: 1dacc76d0014 ("net/compat/wext: send different messages to compat tasks") Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25sparc: fix compat recv/recvfrom syscallsArnd Bergmann
sparc has the wrong compat version of recv() and recvfrom() for both the direct syscalls and socketcall(). The direct syscalls just need to use the compat version. For socketcall, the same thing could be done, but it seems better to completely remove the custom assembler code for it and just use the same implementation that everyone else has. Fixes: 1dacc76d0014 ("net/compat/wext: send different messages to compat tasks") Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25sparc: fix old compat_sys_select()Arnd Bergmann
sparc has two identical select syscalls at numbers 93 and 230, respectively. During the conversion to the modern syscall.tbl format, the older one of the two broke in compat mode, and now refers to the native 64-bit syscall. Restore the correct behavior. This has very little effect, as glibc has been using the newer number anyway. Fixes: 6ff645dd683a ("sparc: add system call table generation support") Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25syscalls: fix compat_sys_io_pgetevents_time64 usageArnd Bergmann
Using sys_io_pgetevents() as the entry point for compat mode tasks works almost correctly, but misses the sign extension for the min_nr and nr arguments. This was addressed on parisc by switching to compat_sys_io_pgetevents_time64() in commit 6431e92fc827 ("parisc: io_pgetevents_time64() needs compat syscall in 32-bit compat mode"), as well as by using more sophisticated system call wrappers on x86 and s390. However, arm64, mips, powerpc, sparc and riscv still have the same bug. Change all of them over to use compat_sys_io_pgetevents_time64() like parisc already does. This was clearly the intention when the function was originally added, but it got hooked up incorrectly in the tables. Cc: stable@vger.kernel.org Fixes: 48166e6ea47d ("y2038: add 64-bit time_t syscalls to all 32-bit architectures") Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25s390/boot: Do not adjust GOT entries for undef weak symJens Remus
Since commit 778666df60f0 ("s390: compile relocatable kernel without -fPIE") and commit 00cda11d3b2e ("s390: Compile kernel with -fPIC and link with -no-pie") the kernel on s390x may have a Global Offset Table (GOT) whose entries are adjusted for KASLR in kaslr_adjust_got(). The GOT may contain entries for undefined weak symbols that resolved to zero. That is the resulting GOT entry value is zero. Adjusting those entries unconditionally in kaslr_adjust_got() is wrong. Otherwise the following sample code would erroneously assume foo to be defined, due to the adjustment changing the zero-value to a non-zero one: extern int foo __attribute__((weak)); if (*foo) /* foo is defined [or undefined and erroneously adjusted] */ The vmlinux build at commit 00cda11d3b2e ("s390: Compile kernel with -fPIC and link with -no-pie") with defconfig actually had two GOT entries for the undefined weak symbols __start_BTF and __stop_BTF: $ objdump -tw vmlinux | grep -F "*UND*" 0000000000000000 w *UND* 0000000000000000 __stop_BTF 0000000000000000 w *UND* 0000000000000000 __start_BTF $ readelf -rw vmlinux | grep -E "R_390_GOTENT +0{16}" 000000345760 2776a0000001a R_390_GOTENT 0000000000000000 __stop_BTF + 2 000000345766 2d5480000001a R_390_GOTENT 0000000000000000 __start_BTF + 2 The s390-specific vmlinux linker script sets the section start to __START_KERNEL, which is currently defined as 0x100000 on s390x. Access to lowcore is performed via a pointer of 0 and not a symbol in a section starting at 0. The first 64K are reserved for the loader on s390x. Thus it is safe to assume that __START_KERNEL will never be 0. As a result there cannot be any defined symbols resolving to zero in the kernel. Note that the first three GOT entries are reserved for the dynamic loader on s390x. [1] In the kernel they are zero. Therefore no extra handling is required to skip these. Skip adjusting GOT entries with a value of zero in kaslr_adjust_got(). While at it update the comment when a GOT exists on s390x. Since commit 00cda11d3b2e ("s390: Compile kernel with -fPIC and link with -no-pie") it no longer only exists when compiling with Clang, but also with GCC. [1]: s390x ELF ABI, section "Global Offset Table", https://github.com/IBM/s390x-abi/releases Fixes: 778666df60f0 ("s390: compile relocatable kernel without -fPIE") Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-06-25thermal: gov_step_wise: Go straight to instance->lower when mitigation is overRafael J. Wysocki
Commit b6846826982b ("thermal: gov_step_wise: Restore passive polling management") attempted to fix a Step-Wise thermal governor issue introduced by commit 042a3d80f118 ("thermal: core: Move passive polling management to the core"), which caused the governor to leave cooling devices in high states, by partially reverting that commit. However, this turns out to be insufficient on some systems due to interactions between the governor code restored by commit b6846826982b and the passive polling management in the thermal core. For this reason, revert commit b6846826982b and make the governor set the target cooling device state to the "lower" one as soon as the zone temperature falls below the threshold of the trip point corresponding to the given thermal instance, which means that thermal mitigation is not necessary any more. Before this change the "lower" cooling device state would be reached in steps through the passive polling mechanism which was questionable for three reasons: (1) cooling device were kept in high states when that was not necessary (and it could adversely impact performance), (2) it only worked for thermal zones with nonzero passive_delay_jiffies value, and (3) passive polling belongs to the core and should not be hijacked by governors for their internal purposes. Fixes: b6846826982b ("thermal: gov_step_wise: Restore passive polling management") Closes: https://lore.kernel.org/linux-pm/6759ce9f-281d-4fcd-bb4c-b784a1cc5f6e@oldschoolsolutions.biz Reported-by: Jens Glathe <jens.glathe@oldschoolsolutions.biz> Tested-by: Jens Glathe <jens.glathe@oldschoolsolutions.biz> Link: https://patch.msgid.link/12464461.O9o76ZdvQC@rjwysocki.net Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Steev Klimaszewski <steev@kali.org> Tested-by: Johan Hovold <johan+linaro@kernel.org>