summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2023-12-07mm, pmem, xfs: Introduce MF_MEM_PRE_REMOVE for unbindShiyang Ruan
Now, if we suddenly remove a PMEM device(by calling unbind) which contains FSDAX while programs are still accessing data in this device, e.g.: ``` $FSSTRESS_PROG -d $SCRATCH_MNT -n 99999 -p 4 & # $FSX_PROG -N 1000000 -o 8192 -l 500000 $SCRATCH_MNT/t001 & echo "pfn1.1" > /sys/bus/nd/drivers/nd_pmem/unbind ``` it could come into an unacceptable state: 1. device has gone but mount point still exists, and umount will fail with "target is busy" 2. programs will hang and cannot be killed 3. may crash with NULL pointer dereference To fix this, we introduce a MF_MEM_PRE_REMOVE flag to let it know that we are going to remove the whole device, and make sure all related processes could be notified so that they could end up gracefully. This patch is inspired by Dan's "mm, dax, pmem: Introduce dev_pagemap_failure()"[1]. With the help of dax_holder and ->notify_failure() mechanism, the pmem driver is able to ask filesystem on it to unmap all files in use, and notify processes who are using those files. Call trace: trigger unbind -> unbind_store() -> ... (skip) -> devres_release_all() -> kill_dax() -> dax_holder_notify_failure(dax_dev, 0, U64_MAX, MF_MEM_PRE_REMOVE) -> xfs_dax_notify_failure() `-> freeze_super() // freeze (kernel call) `-> do xfs rmap ` -> mf_dax_kill_procs() ` -> collect_procs_fsdax() // all associated processes ` -> unmap_and_kill() ` -> invalidate_inode_pages2_range() // drop file's cache `-> thaw_super() // thaw (both kernel & user call) Introduce MF_MEM_PRE_REMOVE to let filesystem know this is a remove event. Use the exclusive freeze/thaw[2] to lock the filesystem to prevent new dax mapping from being created. Do not shutdown filesystem directly if configuration is not supported, or if failure range includes metadata area. Make sure all files and processes(not only the current progress) are handled correctly. Also drop the cache of associated files before pmem is removed. [1]: https://lore.kernel.org/linux-mm/161604050314.1463742.14151665140035795571.stgit@dwillia2-desk3.amr.corp.intel.com/ [2]: https://lore.kernel.org/linux-xfs/169116275623.3187159.16862410128731457358.stg-ugh@frogsfrogsfrogs/ Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-07PCI: add INTEL_HDA_ARL to pci_ids.hPierre-Louis Bossart
The PCI ID insertion follows the increasing order in the table, but this hardware follows MTL (MeteorLake). Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Kai Vehmanen <kai.vehmanen@linux.intel.com> Acked-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20231204212710.185976-2-pierre-louis.bossart@linux.intel.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-12-07drm/mipi-dsi: Fix detach call without attachTomi Valkeinen
It's been reported that DSI host driver's detach can be called without the attach ever happening: https://lore.kernel.org/all/20230412073954.20601-1-tony@atomide.com/ After reading the code, I think this is what happens: We have a DSI host defined in the device tree and a DSI peripheral under that host (i.e. an i2c device using the DSI as data bus doesn't exhibit this behavior). The host driver calls mipi_dsi_host_register(), which causes (via a few functions) mipi_dsi_device_add() to be called for the DSI peripheral. So now we have a DSI device under the host, but attach hasn't been called. Normally the probing of the devices continues, and eventually the DSI peripheral's driver will call mipi_dsi_attach(), attaching the peripheral. However, if the host driver's probe encounters an error after calling mipi_dsi_host_register(), and before the peripheral has called mipi_dsi_attach(), the host driver will do cleanups and return an error from its probe function. The cleanups include calling mipi_dsi_host_unregister(). mipi_dsi_host_unregister() will call two functions for all its DSI peripheral devices: mipi_dsi_detach() and mipi_dsi_device_unregister(). The latter makes sense, as the device exists, but the former may be wrong as attach has not necessarily been done. To fix this, track the attached state of the peripheral, and only detach from mipi_dsi_host_unregister() if the peripheral was attached. Note that I have only tested this with a board with an i2c DSI peripheral, not with a "pure" DSI peripheral. However, slightly related, the unregister machinery still seems broken. E.g. if the DSI host driver is unbound, it'll detach and unregister the DSI peripherals. After that, when the DSI peripheral driver unbound it'll call detach either directly or using the devm variant, leading to a crash. And probably the driver will crash if it happens, for some reason, to try to send a message via the DSI bus. But that's another topic. Tested-by: H. Nikolaus Schaller <hns@goldelico.com> Acked-by: Maxime Ripard <mripard@kernel.org> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230921-dsi-detach-fix-v1-1-d0de2d1621d9@ideasonboard.com
2023-12-06ipv6: add debug checks in fib6_info_release()Eric Dumazet
Some elusive syzbot reports are hinting to fib6_info_release(), with a potential dangling f6i->gc_link anchor. Add debug checks so that syzbot can catch the issue earlier eventually. BUG: KASAN: slab-use-after-free in __hlist_del include/linux/list.h:990 [inline] BUG: KASAN: slab-use-after-free in hlist_del_init include/linux/list.h:1016 [inline] BUG: KASAN: slab-use-after-free in fib6_clean_expires_locked include/net/ip6_fib.h:533 [inline] BUG: KASAN: slab-use-after-free in fib6_purge_rt+0x986/0x9c0 net/ipv6/ip6_fib.c:1064 Write of size 8 at addr ffff88802805a840 by task syz-executor.1/10057 CPU: 1 PID: 10057 Comm: syz-executor.1 Not tainted 6.7.0-rc2-syzkaller-00029-g9b6de136b5f0 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:364 [inline] print_report+0xc4/0x620 mm/kasan/report.c:475 kasan_report+0xda/0x110 mm/kasan/report.c:588 __hlist_del include/linux/list.h:990 [inline] hlist_del_init include/linux/list.h:1016 [inline] fib6_clean_expires_locked include/net/ip6_fib.h:533 [inline] fib6_purge_rt+0x986/0x9c0 net/ipv6/ip6_fib.c:1064 fib6_del_route net/ipv6/ip6_fib.c:1993 [inline] fib6_del+0xa7a/0x1750 net/ipv6/ip6_fib.c:2038 __ip6_del_rt net/ipv6/route.c:3866 [inline] ip6_del_rt+0xf7/0x200 net/ipv6/route.c:3881 ndisc_router_discovery+0x295b/0x3560 net/ipv6/ndisc.c:1372 ndisc_rcv+0x3de/0x5f0 net/ipv6/ndisc.c:1856 icmpv6_rcv+0x1470/0x19c0 net/ipv6/icmp.c:979 ip6_protocol_deliver_rcu+0x170/0x13e0 net/ipv6/ip6_input.c:438 ip6_input_finish+0x14f/0x2f0 net/ipv6/ip6_input.c:483 NF_HOOK include/linux/netfilter.h:314 [inline] NF_HOOK include/linux/netfilter.h:308 [inline] ip6_input+0xa1/0xc0 net/ipv6/ip6_input.c:492 ip6_mc_input+0x48b/0xf40 net/ipv6/ip6_input.c:586 dst_input include/net/dst.h:461 [inline] ip6_rcv_finish net/ipv6/ip6_input.c:79 [inline] NF_HOOK include/linux/netfilter.h:314 [inline] NF_HOOK include/linux/netfilter.h:308 [inline] ipv6_rcv+0x24e/0x380 net/ipv6/ip6_input.c:310 __netif_receive_skb_one_core+0x115/0x180 net/core/dev.c:5529 __netif_receive_skb+0x1f/0x1b0 net/core/dev.c:5643 netif_receive_skb_internal net/core/dev.c:5729 [inline] netif_receive_skb+0x133/0x700 net/core/dev.c:5788 tun_rx_batched+0x429/0x780 drivers/net/tun.c:1579 tun_get_user+0x29e3/0x3bc0 drivers/net/tun.c:2002 tun_chr_write_iter+0xe8/0x210 drivers/net/tun.c:2048 call_write_iter include/linux/fs.h:2020 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x64f/0xdf0 fs/read_write.c:584 ksys_write+0x12f/0x250 fs/read_write.c:637 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b RIP: 0033:0x7f38e387b82f Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 b9 80 02 00 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 0c 81 02 00 48 RSP: 002b:00007f38e45c9090 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007f38e399bf80 RCX: 00007f38e387b82f RDX: 00000000000003b6 RSI: 0000000020000680 RDI: 00000000000000c8 RBP: 00007f38e38c847a R08: 0000000000000000 R09: 0000000000000000 R10: 00000000000003b6 R11: 0000000000000293 R12: 0000000000000000 R13: 000000000000000b R14: 00007f38e399bf80 R15: 00007f38e3abfa48 </TASK> Allocated by task 10044: kasan_save_stack+0x33/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 ____kasan_kmalloc mm/kasan/common.c:374 [inline] __kasan_kmalloc+0xa2/0xb0 mm/kasan/common.c:383 kasan_kmalloc include/linux/kasan.h:198 [inline] __do_kmalloc_node mm/slab_common.c:1007 [inline] __kmalloc+0x59/0x90 mm/slab_common.c:1020 kmalloc include/linux/slab.h:604 [inline] kzalloc include/linux/slab.h:721 [inline] fib6_info_alloc+0x40/0x160 net/ipv6/ip6_fib.c:155 ip6_route_info_create+0x337/0x1e70 net/ipv6/route.c:3749 ip6_route_add+0x26/0x150 net/ipv6/route.c:3843 rt6_add_route_info+0x2e7/0x4b0 net/ipv6/route.c:4316 rt6_route_rcv+0x76c/0xbf0 net/ipv6/route.c:985 ndisc_router_discovery+0x138b/0x3560 net/ipv6/ndisc.c:1529 ndisc_rcv+0x3de/0x5f0 net/ipv6/ndisc.c:1856 icmpv6_rcv+0x1470/0x19c0 net/ipv6/icmp.c:979 ip6_protocol_deliver_rcu+0x170/0x13e0 net/ipv6/ip6_input.c:438 ip6_input_finish+0x14f/0x2f0 net/ipv6/ip6_input.c:483 NF_HOOK include/linux/netfilter.h:314 [inline] NF_HOOK include/linux/netfilter.h:308 [inline] ip6_input+0xa1/0xc0 net/ipv6/ip6_input.c:492 ip6_mc_input+0x48b/0xf40 net/ipv6/ip6_input.c:586 dst_input include/net/dst.h:461 [inline] ip6_rcv_finish net/ipv6/ip6_input.c:79 [inline] NF_HOOK include/linux/netfilter.h:314 [inline] NF_HOOK include/linux/netfilter.h:308 [inline] ipv6_rcv+0x24e/0x380 net/ipv6/ip6_input.c:310 __netif_receive_skb_one_core+0x115/0x180 net/core/dev.c:5529 __netif_receive_skb+0x1f/0x1b0 net/core/dev.c:5643 netif_receive_skb_internal net/core/dev.c:5729 [inline] netif_receive_skb+0x133/0x700 net/core/dev.c:5788 tun_rx_batched+0x429/0x780 drivers/net/tun.c:1579 tun_get_user+0x29e3/0x3bc0 drivers/net/tun.c:2002 tun_chr_write_iter+0xe8/0x210 drivers/net/tun.c:2048 call_write_iter include/linux/fs.h:2020 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x64f/0xdf0 fs/read_write.c:584 ksys_write+0x12f/0x250 fs/read_write.c:637 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Freed by task 5123: kasan_save_stack+0x33/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:522 ____kasan_slab_free mm/kasan/common.c:236 [inline] ____kasan_slab_free+0x15b/0x1b0 mm/kasan/common.c:200 kasan_slab_free include/linux/kasan.h:164 [inline] slab_free_hook mm/slub.c:1800 [inline] slab_free_freelist_hook+0x114/0x1e0 mm/slub.c:1826 slab_free mm/slub.c:3809 [inline] __kmem_cache_free+0xc0/0x180 mm/slub.c:3822 rcu_do_batch kernel/rcu/tree.c:2158 [inline] rcu_core+0x819/0x1680 kernel/rcu/tree.c:2431 __do_softirq+0x21a/0x8de kernel/softirq.c:553 Last potentially related work creation: kasan_save_stack+0x33/0x50 mm/kasan/common.c:45 __kasan_record_aux_stack+0xbc/0xd0 mm/kasan/generic.c:492 __call_rcu_common.constprop.0+0x9a/0x7a0 kernel/rcu/tree.c:2681 fib6_info_release include/net/ip6_fib.h:332 [inline] fib6_info_release include/net/ip6_fib.h:329 [inline] rt6_route_rcv+0xa4e/0xbf0 net/ipv6/route.c:997 ndisc_router_discovery+0x138b/0x3560 net/ipv6/ndisc.c:1529 ndisc_rcv+0x3de/0x5f0 net/ipv6/ndisc.c:1856 icmpv6_rcv+0x1470/0x19c0 net/ipv6/icmp.c:979 ip6_protocol_deliver_rcu+0x170/0x13e0 net/ipv6/ip6_input.c:438 ip6_input_finish+0x14f/0x2f0 net/ipv6/ip6_input.c:483 NF_HOOK include/linux/netfilter.h:314 [inline] NF_HOOK include/linux/netfilter.h:308 [inline] ip6_input+0xa1/0xc0 net/ipv6/ip6_input.c:492 ip6_mc_input+0x48b/0xf40 net/ipv6/ip6_input.c:586 dst_input include/net/dst.h:461 [inline] ip6_rcv_finish net/ipv6/ip6_input.c:79 [inline] NF_HOOK include/linux/netfilter.h:314 [inline] NF_HOOK include/linux/netfilter.h:308 [inline] ipv6_rcv+0x24e/0x380 net/ipv6/ip6_input.c:310 __netif_receive_skb_one_core+0x115/0x180 net/core/dev.c:5529 __netif_receive_skb+0x1f/0x1b0 net/core/dev.c:5643 netif_receive_skb_internal net/core/dev.c:5729 [inline] netif_receive_skb+0x133/0x700 net/core/dev.c:5788 tun_rx_batched+0x429/0x780 drivers/net/tun.c:1579 tun_get_user+0x29e3/0x3bc0 drivers/net/tun.c:2002 tun_chr_write_iter+0xe8/0x210 drivers/net/tun.c:2048 call_write_iter include/linux/fs.h:2020 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x64f/0xdf0 fs/read_write.c:584 ksys_write+0x12f/0x250 fs/read_write.c:637 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Second to last potentially related work creation: kasan_save_stack+0x33/0x50 mm/kasan/common.c:45 __kasan_record_aux_stack+0xbc/0xd0 mm/kasan/generic.c:492 insert_work+0x38/0x230 kernel/workqueue.c:1647 __queue_work+0xcdc/0x11f0 kernel/workqueue.c:1803 call_timer_fn+0x193/0x590 kernel/time/timer.c:1700 expire_timers kernel/time/timer.c:1746 [inline] __run_timers+0x585/0xb20 kernel/time/timer.c:2022 run_timer_softirq+0x58/0xd0 kernel/time/timer.c:2035 __do_softirq+0x21a/0x8de kernel/softirq.c:553 The buggy address belongs to the object at ffff88802805a800 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 64 bytes inside of freed 512-byte region [ffff88802805a800, ffff88802805aa00) The buggy address belongs to the physical page: page:ffffea0000a01600 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x28058 head:ffffea0000a01600 order:2 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0xfff00000000840(slab|head|node=0|zone=1|lastcpupid=0x7ff) page_type: 0xffffffff() raw: 00fff00000000840 ffff888013041c80 ffffea0001e02600 dead000000000002 raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 2, migratetype Unmovable, gfp_mask 0x1d20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL), pid 18706, tgid 18699 (syz-executor.2), ts 999991973280, free_ts 996884464281 set_page_owner include/linux/page_owner.h:31 [inline] post_alloc_hook+0x2d0/0x350 mm/page_alloc.c:1537 prep_new_page mm/page_alloc.c:1544 [inline] get_page_from_freelist+0xa25/0x36d0 mm/page_alloc.c:3312 __alloc_pages+0x22e/0x2420 mm/page_alloc.c:4568 alloc_pages_mpol+0x258/0x5f0 mm/mempolicy.c:2133 alloc_slab_page mm/slub.c:1870 [inline] allocate_slab mm/slub.c:2017 [inline] new_slab+0x283/0x3c0 mm/slub.c:2070 ___slab_alloc+0x979/0x1500 mm/slub.c:3223 __slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3322 __slab_alloc_node mm/slub.c:3375 [inline] slab_alloc_node mm/slub.c:3468 [inline] __kmem_cache_alloc_node+0x131/0x310 mm/slub.c:3517 __do_kmalloc_node mm/slab_common.c:1006 [inline] __kmalloc+0x49/0x90 mm/slab_common.c:1020 kmalloc include/linux/slab.h:604 [inline] kzalloc include/linux/slab.h:721 [inline] copy_splice_read+0x1ac/0x8f0 fs/splice.c:338 vfs_splice_read fs/splice.c:992 [inline] vfs_splice_read+0x2ea/0x3b0 fs/splice.c:962 splice_direct_to_actor+0x2a5/0xa30 fs/splice.c:1069 do_splice_direct+0x1af/0x280 fs/splice.c:1194 do_sendfile+0xb3e/0x1310 fs/read_write.c:1254 __do_sys_sendfile64 fs/read_write.c:1322 [inline] __se_sys_sendfile64 fs/read_write.c:1308 [inline] __x64_sys_sendfile64+0x1d6/0x220 fs/read_write.c:1308 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 page last free stack trace: reset_page_owner include/linux/page_owner.h:24 [inline] free_pages_prepare mm/page_alloc.c:1137 [inline] free_unref_page_prepare+0x4fa/0xaa0 mm/page_alloc.c:2347 free_unref_page_list+0xe6/0xb40 mm/page_alloc.c:2533 release_pages+0x32a/0x14f0 mm/swap.c:1042 tlb_batch_pages_flush+0x9a/0x190 mm/mmu_gather.c:98 tlb_flush_mmu_free mm/mmu_gather.c:293 [inline] tlb_flush_mmu mm/mmu_gather.c:300 [inline] tlb_finish_mmu+0x14b/0x6f0 mm/mmu_gather.c:392 exit_mmap+0x38b/0xa70 mm/mmap.c:3321 __mmput+0x12a/0x4d0 kernel/fork.c:1349 mmput+0x62/0x70 kernel/fork.c:1371 exit_mm kernel/exit.c:567 [inline] do_exit+0x9ad/0x2ae0 kernel/exit.c:858 do_group_exit+0xd4/0x2a0 kernel/exit.c:1021 get_signal+0x23be/0x2790 kernel/signal.c:2904 arch_do_signal_or_restart+0x90/0x7f0 arch/x86/kernel/signal.c:309 exit_to_user_mode_loop kernel/entry/common.c:168 [inline] exit_to_user_mode_prepare+0x121/0x240 kernel/entry/common.c:204 irqentry_exit_to_user_mode+0xa/0x40 kernel/entry/common.c:309 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645 Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20231205173250.2982846-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-07firmware: zynqmp: Add support to handle IPI CRC failureJay Buddhabhatti
Added new PM error code XST_PM_INVALID_CRC to handle CRC validation failure during IPI communication. Co-developed-by: Naman Trivedi Manojbhai <naman.trivedimanojbhai@amd.com> Signed-off-by: Naman Trivedi Manojbhai <naman.trivedimanojbhai@amd.com> Signed-off-by: Jay Buddhabhatti <jay.buddhabhatti@amd.com> Link: https://lore.kernel.org/r/20231129112713.22718-6-jay.buddhabhatti@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-07drivers: soc: xilinx: Fix error message on SGI registration failureJay Buddhabhatti
Failure to register SGI for firmware event notification is non-fatal error when feature is not supported by other modules such as Xen and TF-A. Add _info level log message for such special case. Also add XST_PM_INVALID_VERSION error code and map it to -EOPNOSUPP Linux kernel error code. If feature is not supported or EEMI API version is mismatch, firmware can return XST_PM_INVALID_VERSION = 4 or XST_PM_NO_FEATURE = 19 error code. Co-developed-by: Tanmay Shah <tanmay.shah@amd.com> Signed-off-by: Tanmay Shah <tanmay.shah@amd.com> Signed-off-by: Jay Buddhabhatti <jay.buddhabhatti@amd.com> Link: https://lore.kernel.org/r/20231129112713.22718-5-jay.buddhabhatti@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-07firmware: xilinx: Expand feature check to support all PLM modulesJay Buddhabhatti
To support feature check for all modules, append the module id of the API that is being checked to the feature check API so it could be routed to the target module for processing. There is no need to check compatible string because the board information is taken via firmware interface. Co-developed-by: Saeed Nowshadi <saeed.nowshadi@amd.com> Signed-off-by: Saeed Nowshadi <saeed.nowshadi@amd.com> Signed-off-by: Jay Buddhabhatti <jay.buddhabhatti@amd.com> Link: https://lore.kernel.org/r/20231129112713.22718-3-jay.buddhabhatti@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-07firmware: xilinx: Update firmware call interface to support additional argsJay Buddhabhatti
System-level platform management layer (do_fw_call()) has support for maximum of 5 arguments as of now (1 EEMI API ID + 4 command arguments). In order to support new EEMI PM_IOCTL IDs (Secure Read/Write), this support must be extended to support one additional argument, which results in a configuration of - 1 EEMI API ID + 5 command arguments. Update zynqmp_pm_invoke_fn() and do_fw_call() with this new definition containing variable arguments. As a result, update all the references to pm invoke function with the updated definition. Co-developed-by: Izhar Ameer Shaikh <izhar.ameer.shaikh@amd.com> Signed-off-by: Izhar Ameer Shaikh <izhar.ameer.shaikh@amd.com> Signed-off-by: Jay Buddhabhatti <jay.buddhabhatti@amd.com> Link: https://lore.kernel.org/r/20231129112713.22718-2-jay.buddhabhatti@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-06bpf: Use arch_bpf_trampoline_sizeSong Liu
Instead of blindly allocating PAGE_SIZE for each trampoline, check the size of the trampoline with arch_bpf_trampoline_size(). This size is saved in bpf_tramp_image->size, and used for modmem charge/uncharge. The fallback arch_alloc_bpf_trampoline() still allocates a whole page because we need to use set_memory_* to protect the memory. struct_ops trampoline still uses a whole page for multiple trampolines. With this size check at caller (regular trampoline and struct_ops trampoline), remove arch_bpf_trampoline_size() from arch_prepare_bpf_trampoline() in archs. Also, update bpf_image_ksym_add() to handle symbol of different sizes. Signed-off-by: Song Liu <song@kernel.org> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> # on s390x Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> # on riscv Link: https://lore.kernel.org/r/20231206224054.492250-7-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf: Add arch_bpf_trampoline_size()Song Liu
This helper will be used to calculate the size of the trampoline before allocating the memory. arch_prepare_bpf_trampoline() for arm64 and riscv64 can use arch_bpf_trampoline_size() to check the trampoline fits in the image. OTOH, arch_prepare_bpf_trampoline() for s390 has to call the JIT process twice, so it cannot use arch_bpf_trampoline_size(). Signed-off-by: Song Liu <song@kernel.org> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> # on s390x Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Björn Töpel <bjorn@rivosinc.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> # on riscv Link: https://lore.kernel.org/r/20231206224054.492250-6-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf: Add helpers for trampoline image managementSong Liu
As BPF trampoline of different archs moves from bpf_jit_[alloc|free]_exec() to bpf_prog_pack_[alloc|free](), we need to use different _alloc, _free for different archs during the transition. Add the following helpers for this transition: void *arch_alloc_bpf_trampoline(unsigned int size); void arch_free_bpf_trampoline(void *image, unsigned int size); void arch_protect_bpf_trampoline(void *image, unsigned int size); void arch_unprotect_bpf_trampoline(void *image, unsigned int size); The fallback version of these helpers require size <= PAGE_SIZE, but they are only called with size == PAGE_SIZE. They will be called with size < PAGE_SIZE when arch_bpf_trampoline_size() helper is introduced later. Signed-off-by: Song Liu <song@kernel.org> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> # on s390x Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20231206224054.492250-4-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf: Adjust argument names of arch_prepare_bpf_trampoline()Song Liu
We are using "im" for "struct bpf_tramp_image" and "tr" for "struct bpf_trampoline" in most of the code base. The only exception is the prototype and fallback version of arch_prepare_bpf_trampoline(). Update them to match the rest of the code base. We mix "orig_call" and "func_addr" for the argument in different versions of arch_prepare_bpf_trampoline(). s/orig_call/func_addr/g so they match. Signed-off-by: Song Liu <song@kernel.org> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> # on s390x Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20231206224054.492250-3-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf: Let bpf_prog_pack_free handle any pointerSong Liu
Currently, bpf_prog_pack_free only can only free pointer to struct bpf_binary_header, which is not flexible. Add a size argument to bpf_prog_pack_free so that it can handle any pointer. Signed-off-by: Song Liu <song@kernel.org> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> # on s390x Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20231206224054.492250-2-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06Merge branch 'master' into mm-hotfixes-stableAndrew Morton
2023-12-06highmem: fix a memory copy problem in memcpy_from_folioSu Hui
Clang static checker complains that value stored to 'from' is never read. And memcpy_from_folio() only copy the last chunk memory from folio to destination. Use 'to += chunk' to replace 'from += chunk' to fix this typo problem. Link: https://lkml.kernel.org/r/20231130034017.1210429-1-suhui@nfschina.com Fixes: b23d03ef7af5 ("highmem: add memcpy_to_folio() and memcpy_from_folio()") Signed-off-by: Su Hui <suhui@nfschina.com> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jiaqi Yan <jiaqiyan@google.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Collingbourne <pcc@google.com> Cc: Tom Rix <trix@redhat.com> Cc: Tony Luck <tony.luck@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-06units: add missing headerAndy Shevchenko
BITS_PER_BYTE is defined in bits.h. Link: https://lkml.kernel.org/r/20231128174404.393393-1-andriy.shevchenko@linux.intel.com Fixes: e8eed5f7366f ("units: Add BYTES_PER_*BIT") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Cc: Damian Muszynski <damian.muszynski@intel.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-06hugetlb: fix null-ptr-deref in hugetlb_vma_lock_writeMike Kravetz
The routine __vma_private_lock tests for the existence of a reserve map associated with a private hugetlb mapping. A pointer to the reserve map is in vma->vm_private_data. __vma_private_lock was checking the pointer for NULL. However, it is possible that the low bits of the pointer could be used as flags. In such instances, vm_private_data is not NULL and not a valid pointer. This results in the null-ptr-deref reported by syzbot: general protection fault, probably for non-canonical address 0xdffffc000000001d: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x00000000000000e8-0x00000000000000ef] CPU: 0 PID: 5048 Comm: syz-executor139 Not tainted 6.6.0-rc7-syzkaller-00142-g88 8cf78c29e2 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 1 0/09/2023 RIP: 0010:__lock_acquire+0x109/0x5de0 kernel/locking/lockdep.c:5004 ... Call Trace: <TASK> lock_acquire kernel/locking/lockdep.c:5753 [inline] lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5718 down_write+0x93/0x200 kernel/locking/rwsem.c:1573 hugetlb_vma_lock_write mm/hugetlb.c:300 [inline] hugetlb_vma_lock_write+0xae/0x100 mm/hugetlb.c:291 __hugetlb_zap_begin+0x1e9/0x2b0 mm/hugetlb.c:5447 hugetlb_zap_begin include/linux/hugetlb.h:258 [inline] unmap_vmas+0x2f4/0x470 mm/memory.c:1733 exit_mmap+0x1ad/0xa60 mm/mmap.c:3230 __mmput+0x12a/0x4d0 kernel/fork.c:1349 mmput+0x62/0x70 kernel/fork.c:1371 exit_mm kernel/exit.c:567 [inline] do_exit+0x9ad/0x2a20 kernel/exit.c:861 __do_sys_exit kernel/exit.c:991 [inline] __se_sys_exit kernel/exit.c:989 [inline] __x64_sys_exit+0x42/0x50 kernel/exit.c:989 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Mask off low bit flags before checking for NULL pointer. In addition, the reserve map only 'belongs' to the OWNER (parent in parent/child relationships) so also check for the OWNER flag. Link: https://lkml.kernel.org/r/20231114012033.259600-1-mike.kravetz@oracle.com Reported-by: syzbot+6ada951e7c0f7bc8a71e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-mm/00000000000078d1e00608d7878b@google.com/ Fixes: bf4916922c60 ("hugetlbfs: extend hugetlb_vma_lock to private VMAs") Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Rik van Riel <riel@surriel.com> Cc: Edward Adam Davis <eadavis@qq.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Tom Rix <trix@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-12-06bpf: rename MAX_BPF_LINK_TYPE into __MAX_BPF_LINK_TYPE for consistencyAndrii Nakryiko
To stay consistent with the naming pattern used for similar cases in BPF UAPI (__MAX_BPF_ATTACH_TYPE, etc), rename MAX_BPF_LINK_TYPE into __MAX_BPF_LINK_TYPE. Also similar to MAX_BPF_ATTACH_TYPE and MAX_BPF_REG, add: #define MAX_BPF_LINK_TYPE __MAX_BPF_LINK_TYPE Not all __MAX_xxx enums have such #define, so I'm not sure if we should add it or not, but I figured I'll start with a completely backwards compatible way, and we can drop that, if necessary. Also adjust a selftest that used MAX_BPF_LINK_TYPE enum. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20231206190920.1651226-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf: Fix prog_array_map_poke_run map poke updateJiri Olsa
Lee pointed out issue found by syscaller [0] hitting BUG in prog array map poke update in prog_array_map_poke_run function due to error value returned from bpf_arch_text_poke function. There's race window where bpf_arch_text_poke can fail due to missing bpf program kallsym symbols, which is accounted for with check for -EINVAL in that BUG_ON call. The problem is that in such case we won't update the tail call jump and cause imbalance for the next tail call update check which will fail with -EBUSY in bpf_arch_text_poke. I'm hitting following race during the program load: CPU 0 CPU 1 bpf_prog_load bpf_check do_misc_fixups prog_array_map_poke_track map_update_elem bpf_fd_array_map_update_elem prog_array_map_poke_run bpf_arch_text_poke returns -EINVAL bpf_prog_kallsyms_add After bpf_arch_text_poke (CPU 1) fails to update the tail call jump, the next poke update fails on expected jump instruction check in bpf_arch_text_poke with -EBUSY and triggers the BUG_ON in prog_array_map_poke_run. Similar race exists on the program unload. Fixing this by moving the update to bpf_arch_poke_desc_update function which makes sure we call __bpf_arch_text_poke that skips the bpf address check. Each architecture has slightly different approach wrt looking up bpf address in bpf_arch_text_poke, so instead of splitting the function or adding new 'checkip' argument in previous version, it seems best to move the whole map_poke_run update as arch specific code. [0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810 Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT") Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yonghong.song@linux.dev> Cc: Lee Jones <lee@kernel.org> Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20231206083041.1306660-2-jolsa@kernel.org
2023-12-06thermal: sysfs: Rework the handling of trip point updatesRafael J. Wysocki
Both trip_point_temp_store() and trip_point_hyst_store() use thermal_zone_set_trip() to update a given trip point, but none of them actually needs to change more than one field in struct thermal_trip representing it. However, each of them effectively calls __thermal_zone_get_trip() twice in a row for the same trip index value, once directly and once via thermal_zone_set_trip(), which is not particularly efficient, and the way in which thermal_zone_set_trip() carries out the update is not particularly straightforward. Moreover, input processing need not be done under the thermal zone lock in any of these functions. Rework trip_point_temp_store() and trip_point_hyst_store() to address the above, move the part of thermal_zone_set_trip() that is still useful to a new function called thermal_zone_trip_updated() and drop the rest of it. While at it, make trip_point_hyst_store() reject negative hysteresis values. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2023-12-06cgroup/cpuset: Include isolated cpuset CPUs in cpu_is_isolated() checkWaiman Long
Currently, the cpu_is_isolated() function checks only the statically isolated CPUs specified via the "isolcpus" and "nohz_full" kernel command line options. This function is used by vmstat and memcg to reduce interference with isolated CPUs by not doing stat flushing or scheduling works on those CPUs. Workloads running on isolated CPUs within isolated cpuset partitions should receive the same treatment to reduce unnecessary interference. This patch introduces a new cpuset_cpu_is_isolated() function to be called by cpu_is_isolated() so that the set of dynamically created cpuset isolated CPUs will be included in the check. Assuming that testing a bit in a cpumask is atomic, no synchronization primitive is currently used to synchronize access to the cpuset's isolated_cpus mask. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2023-12-06bpf,lsm: add BPF token LSM hooksAndrii Nakryiko
Wire up bpf_token_create and bpf_token_free LSM hooks, which allow to allocate LSM security blob (we add `void *security` field to struct bpf_token for that), but also control who can instantiate BPF token. This follows existing pattern for BPF map and BPF prog. Also add security_bpf_token_allow_cmd() and security_bpf_token_capable() LSM hooks that allow LSM implementation to control and negate (if necessary) BPF token's delegation of a specific bpf_cmd and capability, respectively. Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-12-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf,lsm: refactor bpf_map_alloc/bpf_map_free LSM hooksAndrii Nakryiko
Similarly to bpf_prog_alloc LSM hook, rename and extend bpf_map_alloc hook into bpf_map_create, taking not just struct bpf_map, but also bpf_attr and bpf_token, to give a fuller context to LSMs. Unlike bpf_prog_alloc, there is no need to move the hook around, as it currently is firing right before allocating BPF map ID and FD, which seems to be a sweet spot. But like bpf_prog_alloc/bpf_prog_free combo, make sure that bpf_map_free LSM hook is called even if bpf_map_create hook returned error, as if few LSMs are combined together it could be that one LSM successfully allocated security blob for its needs, while subsequent LSM rejected BPF map creation. The former LSM would still need to free up LSM blob, so we need to ensure security_bpf_map_free() is called regardless of the outcome. Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-11-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf,lsm: refactor bpf_prog_alloc/bpf_prog_free LSM hooksAndrii Nakryiko
Based on upstream discussion ([0]), rework existing bpf_prog_alloc_security LSM hook. Rename it to bpf_prog_load and instead of passing bpf_prog_aux, pass proper bpf_prog pointer for a full BPF program struct. Also, we pass bpf_attr union with all the user-provided arguments for BPF_PROG_LOAD command. This will give LSMs as much information as we can basically provide. The hook is also BPF token-aware now, and optional bpf_token struct is passed as a third argument. bpf_prog_load LSM hook is called after a bunch of sanity checks were performed, bpf_prog and bpf_prog_aux were allocated and filled out, but right before performing full-fledged BPF verification step. bpf_prog_free LSM hook is now accepting struct bpf_prog argument, for consistency. SELinux code is adjusted to all new names, types, and signatures. Note, given that bpf_prog_load (previously bpf_prog_alloc) hook can be used by some LSMs to allocate extra security blob, but also by other LSMs to reject BPF program loading, we need to make sure that bpf_prog_free LSM hook is called after bpf_prog_load/bpf_prog_alloc one *even* if the hook itself returned error. If we don't do that, we run the risk of leaking memory. This seems to be possible today when combining SELinux and BPF LSM, as one example, depending on their relative ordering. Also, for BPF LSM setup, add bpf_prog_load and bpf_prog_free to sleepable LSM hooks list, as they are both executed in sleepable context. Also drop bpf_prog_load hook from untrusted, as there is no issue with refcount or anything else anymore, that originally forced us to add it to untrusted list in c0c852dd1876 ("bpf: Do not mark certain LSM hook arguments as trusted"). We now trigger this hook much later and it should not be an issue anymore. [0] https://lore.kernel.org/bpf/9fe88aef7deabbe87d3fc38c4aea3c69.paul@paul-moore.com/ Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-10-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf: consistently use BPF token throughout BPF verifier logicAndrii Nakryiko
Remove remaining direct queries to perfmon_capable() and bpf_capable() in BPF verifier logic and instead use BPF token (if available) to make decisions about privileges. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-9-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf: take into account BPF token when fetching helper protosAndrii Nakryiko
Instead of performing unconditional system-wide bpf_capable() and perfmon_capable() calls inside bpf_base_func_proto() function (and other similar ones) to determine eligibility of a given BPF helper for a given program, use previously recorded BPF token during BPF_PROG_LOAD command handling to inform the decision. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-8-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf: add BPF token support to BPF_PROG_LOAD commandAndrii Nakryiko
Add basic support of BPF token to BPF_PROG_LOAD. Wire through a set of allowed BPF program types and attach types, derived from BPF FS at BPF token creation time. Then make sure we perform bpf_token_capable() checks everywhere where it's relevant. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf: add BPF token support to BPF_BTF_LOAD commandAndrii Nakryiko
Accept BPF token FD in BPF_BTF_LOAD command to allow BTF data loading through delegated BPF token. BTF loading is a pretty straightforward operation, so as long as BPF token is created with allow_cmds granting BPF_BTF_LOAD command, kernel proceeds to parsing BTF data and creating BTF object. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf: add BPF token support to BPF_MAP_CREATE commandAndrii Nakryiko
Allow providing token_fd for BPF_MAP_CREATE command to allow controlled BPF map creation from unprivileged process through delegated BPF token. Wire through a set of allowed BPF map types to BPF token, derived from BPF FS at BPF token creation time. This, in combination with allowed_cmds allows to create a narrowly-focused BPF token (controlled by privileged agent) with a restrictive set of BPF maps that application can attempt to create. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf: introduce BPF token objectAndrii Nakryiko
Add new kind of BPF kernel object, BPF token. BPF token is meant to allow delegating privileged BPF functionality, like loading a BPF program or creating a BPF map, from privileged process to a *trusted* unprivileged process, all while having a good amount of control over which privileged operations could be performed using provided BPF token. This is achieved through mounting BPF FS instance with extra delegation mount options, which determine what operations are delegatable, and also constraining it to the owning user namespace (as mentioned in the previous patch). BPF token itself is just a derivative from BPF FS and can be created through a new bpf() syscall command, BPF_TOKEN_CREATE, which accepts BPF FS FD, which can be attained through open() API by opening BPF FS mount point. Currently, BPF token "inherits" delegated command, map types, prog type, and attach type bit sets from BPF FS as is. In the future, having an BPF token as a separate object with its own FD, we can allow to further restrict BPF token's allowable set of things either at the creation time or after the fact, allowing the process to guard itself further from unintentionally trying to load undesired kind of BPF programs. But for now we keep things simple and just copy bit sets as is. When BPF token is created from BPF FS mount, we take reference to the BPF super block's owning user namespace, and then use that namespace for checking all the {CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_ADMIN} capabilities that are normally only checked against init userns (using capable()), but now we check them using ns_capable() instead (if BPF token is provided). See bpf_token_capable() for details. Such setup means that BPF token in itself is not sufficient to grant BPF functionality. User namespaced process has to *also* have necessary combination of capabilities inside that user namespace. So while previously CAP_BPF was useless when granted within user namespace, now it gains a meaning and allows container managers and sys admins to have a flexible control over which processes can and need to use BPF functionality within the user namespace (i.e., container in practice). And BPF FS delegation mount options and derived BPF tokens serve as a per-container "flag" to grant overall ability to use bpf() (plus further restrict on which parts of bpf() syscalls are treated as namespaced). Note also, BPF_TOKEN_CREATE command itself requires ns_capable(CAP_BPF) within the BPF FS owning user namespace, rounding up the ns_capable() story of BPF token. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06bpf: add BPF token delegation mount options to BPF FSAndrii Nakryiko
Add few new mount options to BPF FS that allow to specify that a given BPF FS instance allows creation of BPF token (added in the next patch), and what sort of operations are allowed under BPF token. As such, we get 4 new mount options, each is a bit mask - `delegate_cmds` allow to specify which bpf() syscall commands are allowed with BPF token derived from this BPF FS instance; - if BPF_MAP_CREATE command is allowed, `delegate_maps` specifies a set of allowable BPF map types that could be created with BPF token; - if BPF_PROG_LOAD command is allowed, `delegate_progs` specifies a set of allowable BPF program types that could be loaded with BPF token; - if BPF_PROG_LOAD command is allowed, `delegate_attachs` specifies a set of allowable BPF program attach types that could be loaded with BPF token; delegate_progs and delegate_attachs are meant to be used together, as full BPF program type is, in general, determined through both program type and program attach type. Currently, these mount options accept the following forms of values: - a special value "any", that enables all possible values of a given bit set; - numeric value (decimal or hexadecimal, determined by kernel automatically) that specifies a bit mask value directly; - all the values for a given mount option are combined, if specified multiple times. E.g., `mount -t bpf nodev /path/to/mount -o delegate_maps=0x1 -o delegate_maps=0x2` will result in a combined 0x3 mask. Ideally, more convenient (for humans) symbolic form derived from corresponding UAPI enums would be accepted (e.g., `-o delegate_progs=kprobe|tracepoint`) and I intend to implement this, but it requires a bunch of UAPI header churn, so I postponed it until this feature lands upstream or at least there is a definite consensus that this feature is acceptable and is going to make it, just to minimize amount of wasted effort and not increase amount of non-essential code to be reviewed. Attentive reader will notice that BPF FS is now marked as FS_USERNS_MOUNT, which theoretically makes it mountable inside non-init user namespace as long as the process has sufficient *namespaced* capabilities within that user namespace. But in reality we still restrict BPF FS to be mountable only by processes with CAP_SYS_ADMIN *in init userns* (extra check in bpf_fill_super()). FS_USERNS_MOUNT is added to allow creating BPF FS context object (i.e., fsopen("bpf")) from inside unprivileged process inside non-init userns, to capture that userns as the owning userns. It will still be required to pass this context object back to privileged process to instantiate and mount it. This manipulation is important, because capturing non-init userns as the owning userns of BPF FS instance (super block) allows to use that userns to constraint BPF token to that userns later on (see next patch). So creating BPF FS with delegation inside unprivileged userns will restrict derived BPF token objects to only "work" inside that intended userns, making it scoped to a intended "container". Also, setting these delegation options requires capable(CAP_SYS_ADMIN), so unprivileged process cannot set this up without involvement of a privileged process. There is a set of selftests at the end of the patch set that simulates this sequence of steps and validates that everything works as intended. But careful review is requested to make sure there are no missed gaps in the implementation and testing. This somewhat subtle set of aspects is the result of previous discussions ([0]) about various user namespace implications and interactions with BPF token functionality and is necessary to contain BPF token inside intended user namespace. [0] https://lore.kernel.org/bpf/20230704-hochverdient-lehne-eeb9eeef785e@brauner/ Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-06ACPI: bus: update acpi_dev_hid_uid_match() to support multiple typesRaag Jadav
Now that we have _UID matching support for both integer and string types, we can support them into acpi_dev_hid_uid_match() helper as well. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-06ACPI: bus: update acpi_dev_uid_match() to support multiple typesRaag Jadav
According to the ACPI specification, a _UID object can evaluate to either a numeric value or a string. Update acpi_dev_uid_match() to support _UID matching for both integer and string types. Suggested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Raag Jadav <raag.jadav@intel.com> [ rjw: Rename auxiliary macros, relocate kerneldoc comment ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-06Merge tag 'ffa-fixes-6.7' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes Arm FF-A fixes for v6.7 A bunch of fixes addressing issues around the notification support that was added this cycle. They address issue in partition IDs handling in ffa_notification_info_get(), notifications cleanup path and the size of the allocation in ffa_partitions_cleanup(). It also adds check for the notification enabled state so that the drivers registering the callbacks can be rejected if not enabled/supported. It also moves the partitions setup operation after the notification initialisation so that the driver has the correct state for notification enabled/supported before the partitions are initialised/setup. It also now allows FF-A initialisation to complete successfully even when the notification initialisation fails as it is an optional support in the specification. Initial support just allowed it only if the firmware didn't support notifications. Finally, it also adds a fix for smatch warning by declaring ffa_bus_type structure in the header. * tag 'ffa-fixes-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_ffa: Fix ffa_notification_info_get() IDs handling firmware: arm_ffa: Fix the size of the allocation in ffa_partitions_cleanup() firmware: arm_ffa: Fix FFA notifications cleanup path firmware: arm_ffa: Add checks for the notification enabled state firmware: arm_ffa: Setup the partitions after the notification initialisation firmware: arm_ffa: Allow FF-A initialisation even when notification fails firmware: arm_ffa: Declare ffa_bus_type structure in the header Link: https://lore.kernel.org/r/20231116191603.929767-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-12-06Merge tag 'asahi-soc-mailbox-6.8' of https://github.com/AsahiLinux/linux ↵Arnd Bergmann
into soc/drivers Apple SoC mailbox updates for 6.8 This moves the mailbox driver out of the mailbox subsystem and into SoC, next to its only consumer (RTKit). It has been cooking in linux-next for a long while, so it's time to pull it in. * tag 'asahi-soc-mailbox-6.8' of https://github.com/AsahiLinux/linux: soc: apple: mailbox: Add explicit include of platform_device.h soc: apple: mailbox: Rename config symbol to APPLE_MAILBOX mailbox: apple: Delete driver soc: apple: rtkit: Port to the internal mailbox driver soc: apple: mailbox: Add ASC/M3 mailbox driver soc: apple: rtkit: Get rid of apple_rtkit_send_message_wait Link: https://lore.kernel.org/r/6e64472e-c55d-4499-9a61-da59cfd28021@marcan.st Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2023-12-06cpu/hotplug: Remove unused CPU hotplug statesZenghui Yu
There are unused hotplug states which either have never been used or the removal of the usage did not remove the state constant. Drop them to reduce the size of the cpuhp_hp_states array. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20231124121615.1604-1-yuzenghui@huawei.com
2023-12-06dt-bindings: interconnect: Add Qualcomm SM6115 NoCKonrad Dybcio
Add bindings for Qualcomm SM6115 Network-On-Chip interconnect. Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20231125-topic-6115icc-v3-1-bd8907b8cfd7@linaro.org Signed-off-by: Georgi Djakov <djakov@kernel.org>
2023-12-06regulator: event: Add regulator netlink event supportNaresh Solanki
This commit introduces netlink event support to the regulator subsystem. Changes: - Introduce event.c and regnl.h for netlink event handling. - Implement reg_generate_netlink_event to broadcast regulator events. - Update Makefile to include the new event.c file. Signed-off-by: Naresh Solanki <naresh.solanki@9elements.com> Link: https://lore.kernel.org/r/20231205105207.1262928-1-naresh.solanki@9elements.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-12-06soc: microchip: mpfs: enable access to the system controller's flashConor Dooley
The system controller has a flash that contains images used to reprogram the FPGA using IAP (In-Application Programming). Introduce a function that allows a driver with a reference to the system controller to get one to a flash device attached to it. Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2023-12-06net/tcp: Don't store TCP-AO maclen on reqskDmitry Safonov
This extra check doesn't work for a handshake when SYN segment has (current_key.maclen != rnext_key.maclen). It could be amended to preserve rnext_key.maclen instead of current_key.maclen, but that requires a lookup on listen socket. Originally, this extra maclen check was introduced just because it was cheap. Drop it and convert tcp_request_sock::maclen into boolean tcp_request_sock::used_tcp_ao. Fixes: 06b22ef29591 ("net/tcp: Wire TCP-AO to request sockets") Signed-off-by: Dmitry Safonov <dima@arista.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-06net/tcp: Consistently align TCP-AO option in the headerDmitry Safonov
Currently functions that pre-calculate TCP header options length use unaligned TCP-AO header + MAC-length for skb reservation. And the functions that actually write TCP-AO options into skb do align the header. Nothing good can come out of this for ((maclen % 4) != 0). Provide tcp_ao_len_aligned() helper and use it everywhere for TCP header options space calculations. Fixes: 1e03d32bea8e ("net/tcp: Add TCP-AO sign to outgoing packets") Signed-off-by: Dmitry Safonov <dima@arista.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-06mm/slab: move the rest of slub_def.h to mm/slab.hVlastimil Babka
mm/slab.h is the only place to include include/linux/slub_def.h which has allowed switching between SLAB and SLUB. Now we can simply move the contents over and remove slub_def.h. Use this opportunity to fix up some whitespace (alignment) issues. Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: David Rientjes <rientjes@google.com> Tested-by: David Rientjes <rientjes@google.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-06mm/slab: move struct kmem_cache_cpu declaration to slub.cVlastimil Babka
Nothing outside SLUB itself accesses the struct kmem_cache_cpu fields so it does not need to be declared in slub_def.h. This allows also to move enum stat_item. Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: David Rientjes <rientjes@google.com> Tested-by: David Rientjes <rientjes@google.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-06mm/slab: remove mm/slab.c and slab_def.hVlastimil Babka
Remove the SLAB implementation. Update CREDITS. Also update and properly sort the SLOB entry there. RIP SLAB allocator (1996 - 2024) Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: David Rientjes <rientjes@google.com> Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2023-12-06wifi: cfg80211: make RX assoc data constJohannes Berg
This is just a collection of data and we only read it, so make it const. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2023-12-06HID: Intel-ish-hid: Ishtp: Add helper functions for client connectionEven Xu
For every ishtp client driver during initialization state, the flow is: 1 - Allocate an ISHTP client instance 2 - Reserve a host id and link the client instance 3 - Search a firmware client using UUID and get related client information 4 - Bind firmware client id to the ISHTP client instance 5 - Set the state the ISHTP client instance to CONNECTING 6 - Send connect request to firmware 7 - Register event callback for messages from the firmware During deinitizalization state, the flow is: 9 - Set the state the ISHTP client instance to ISHTP_CL_DISCONNECTING 10 - Issue disconnect request to firmware 11 - Unlike the client instance 12 - Flush message queue 13 - Free ISHTP client instance Step 2-7 are identical to the steps of client driver initialization and driver reset flow, but reallocation of the RX/TX ring buffers can be avoided in reset flow. Also for step 9-12, they are identical to the steps of client driver failure handling after connect request, driver reset flow and driver removing. So, add two helper functions to simplify client driver code. ishtp_cl_establish_connection() ishtp_cl_destroy_connection() No functional changes are expected. Signed-off-by: Even Xu <even.xu@intel.com> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2023-12-06drm/atomic-helpers: Invoke end_fb_access while owning plane stateThomas Zimmermann
Invoke drm_plane_helper_funcs.end_fb_access before drm_atomic_helper_commit_hw_done(). The latter function hands over ownership of the plane state to the following commit, which might free it. Releasing resources in end_fb_access then operates on undefined state. This bug has been observed with non-blocking commits when they are being queued up quickly. Here is an example stack trace from the bug report. The plane state has been free'd already, so the pages for drm_gem_fb_vunmap() are gone. Unable to handle kernel paging request at virtual address 0000000100000049 [...] drm_gem_fb_vunmap+0x18/0x74 drm_gem_end_shadow_fb_access+0x1c/0x2c drm_atomic_helper_cleanup_planes+0x58/0xd8 drm_atomic_helper_commit_tail+0x90/0xa0 commit_tail+0x15c/0x188 commit_work+0x14/0x20 Fix this by running end_fb_access immediately after updating all planes in drm_atomic_helper_commit_planes(). The existing clean-up helper drm_atomic_helper_cleanup_planes() now only handles cleanup_fb. For aborted commits, roll back from drm_atomic_helper_prepare_planes() in the new helper drm_atomic_helper_unprepare_planes(). This case is different from regular cleanup, as we have to release the new state; regular cleanup releases the old state. The new helper also invokes cleanup_fb for all planes. The changes mostly involve DRM's atomic helpers. Only two drivers, i915 and nouveau, implement their own commit function. Update them to invoke drm_atomic_helper_unprepare_planes(). Drivers with custom commit_tail function do not require changes. v4: * fix documentation (kernel test robot) v3: * add drm_atomic_helper_unprepare_planes() for rolling back * use correct state for end_fb_access v2: * fix test in drm_atomic_helper_cleanup_planes() Reported-by: Alyssa Ross <hi@alyssa.is> Closes: https://lore.kernel.org/dri-devel/87leazm0ya.fsf@alyssa.is/ Suggested-by: Daniel Vetter <daniel@ffwll.ch> Fixes: 94d879eaf7fb ("drm/atomic-helper: Add {begin,end}_fb_access to plane helpers") Tested-by: Alyssa Ross <hi@alyssa.is> Reviewed-by: Alyssa Ross <hi@alyssa.is> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: <stable@vger.kernel.org> # v6.2+ Link: https://patchwork.freedesktop.org/patch/msgid/20231204083247.22006-1-tzimmermann@suse.de
2023-12-06r8152: add vendor/device ID pair for ASUS USB-C2500Kelly Kane
The ASUS USB-C2500 is an RTL8156 based 2.5G Ethernet controller. Add the vendor and product ID values to the driver. This makes Ethernet work with the adapter. Signed-off-by: Kelly Kane <kelly@hawknetworks.com> Link: https://lore.kernel.org/r/20231203011712.6314-1-kelly@hawknetworks.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-06drm/plane-helper: Move drm_plane_helper_atomic_check() into udlThomas Zimmermann
The udl driver is the only caller of drm_plane_helper_atomic_check(). Move the function into the driver. No functional changes. v2: * fix documenation (Sui) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Sui Jingfeng <suijingfeng@loongson.cn> Link: https://patchwork.freedesktop.org/patch/msgid/20231204090852.1650-2-tzimmermann@suse.de
2023-12-06drm: Remove source code for non-KMS driversThomas Zimmermann
Remove all remaining source code for non-KMS drivers. These drivers have been removed in v6.3 and won't comeback. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: David Airlie <airlied@gmail.com> Reviewed-by: Daniel Vetter <daniel@ffwll.ch> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231122122449.11588-13-tzimmermann@suse.de