summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-12-09skbuff: Introduce slab_build_skb()Kees Cook
syzkaller reported: BUG: KASAN: slab-out-of-bounds in __build_skb_around+0x235/0x340 net/core/skbuff.c:294 Write of size 32 at addr ffff88802aa172c0 by task syz-executor413/5295 For bpf_prog_test_run_skb(), which uses a kmalloc()ed buffer passed to build_skb(). When build_skb() is passed a frag_size of 0, it means the buffer came from kmalloc. In these cases, ksize() is used to find its actual size, but since the allocation may not have been made to that size, actually perform the krealloc() call so that all the associated buffer size checking will be correctly notified (and use the "new" pointer so that compiler hinting works correctly). Split this logic out into a new interface, slab_build_skb(), but leave the original 0 checking for now to catch any stragglers. Reported-by: syzbot+fda18eaa8c12534ccb3b@syzkaller.appspotmail.com Link: https://groups.google.com/g/syzkaller-bugs/c/UnIKxTtU5-0/m/-wbXinkgAQAJ Fixes: 38931d8989b5 ("mm: Make ksize() a reporting-only function") Cc: Pavel Begunkov <asml.silence@gmail.com> Cc: pepsipu <soopthegoop@gmail.com> Cc: syzbot+fda18eaa8c12534ccb3b@syzkaller.appspotmail.com Cc: Vlastimil Babka <vbabka@suse.cz> Cc: kasan-dev <kasan-dev@googlegroups.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: ast@kernel.org Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Hao Luo <haoluo@google.com> Cc: Jesper Dangaard Brouer <hawk@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: jolsa@kernel.org Cc: KP Singh <kpsingh@kernel.org> Cc: martin.lau@linux.dev Cc: Stanislav Fomichev <sdf@google.com> Cc: song@kernel.org Cc: Yonghong Song <yhs@fb.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20221208060256.give.994-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-09net: bcmgenet: Remove the unused functionJiapeng Chong
The function dmadesc_get_addr() is defined in the bcmgenet.c file, but not called elsewhere, so remove this unused function. drivers/net/ethernet/broadcom/genet/bcmgenet.c:120:26: warning: unused function 'dmadesc_get_addr'. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3401 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Link: https://lore.kernel.org/r/20221209033723.32452-1-jiapeng.chong@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-09Merge branch 'mptcp-miscellaneous-cleanup'Jakub Kicinski
Mat Martineau says: ==================== mptcp: Miscellaneous cleanup Two code cleanup patches for the 6.2 merge window that don't change behavior: Patch 1 makes proper use of nlmsg_free(), as suggested by Jakub while reviewing f8c9dfbd875b ("mptcp: add pm listener events"). Patch 2 clarifies success status in a few mptcp functions, which prevents some smatch false positives. ==================== Link: https://lore.kernel.org/r/20221209004431.143701-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-09mptcp: return 0 instead of 'err' varMatthieu Baerts
When 'err' is 0, it looks clearer to return '0' instead of the variable called 'err'. The behaviour is then not modified, just a clearer code. By doing this, we can also avoid false positive smatch warnings like this one: net/mptcp/pm_netlink.c:1169 mptcp_pm_parse_pm_addr_attr() warn: missing error code? 'err' Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Suggested-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-09mptcp: use nlmsg_free instead of kfree_skbGeliang Tang
Use nlmsg_free() instead of kfree_skb() in pm_netlink.c. The SKB's have been created by nlmsg_new(). The proper cleaning way should then be done with nlmsg_free(). For the moment, nlmsg_free() is simply calling kfree_skb() so we don't change the behaviour here. Suggested-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-09Merge tag 'mlx5-updates-2022-12-08' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2022-12-08 1) Support range match action in SW steering Yevgeny Kliteynik says: ======================= The following patch series adds support for a range match action in SW Steering. SW steering is able to match only on the exact values of the packet fields, as requested by the user: the user provides mask for the fields that are of interest, and the exact values to be matched on when the traffic is handled. The following patch series add new type of action - Range Match, where the user provides a field to be matched on and a range of values (min to max) that will be considered as hit. There are several new notions that were implemented in order to support Range Match: - MATCH_RANGES Steering Table Entry (STE): the new STE type that allows matching the packets' fields on the range of values instead of a specific value. - Match Definer: this is a general FW object that defines which fields in the packet will be referenced by the mask and tag of each STE. Match definer ID is part of STE fields, and it defines how the HW needs to interpret the STE's mask/tag values. Till now SW steering used the definers that were managed by FW and implemented the STE layout as described by the HW spec. Now that we're adding a new type of STE, SW steering needs to also be able to define this new STE's layout, and this is do ======================= 2) From OZ add support for meter mtu offload 2.1: Refactor the code to allow both metering and range post actions as a pre-step for adding police mtu offload support. 2.2: Instantiate mtu green/red flow tables with a single match-all rule. Add the green/red actions to the hit/miss table accordingly 2.3: Initialize the meter object with the TC police mtu parameter. Use the hardware range match action feature. 3) From MaorD, support routes with more than 2 nexthops in multipath 4) Michael and Or, improve and extend vport representor counters. * tag 'mlx5-updates-2022-12-08' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Expose steering dropped packets counter net/mlx5: Refactor and expand rep vport stat group net/mlx5e: multipath, support routes with more than 2 nexthops net/mlx5e: TC, add support for meter mtu offload net/mlx5e: meter, add mtu post meter tables net/mlx5e: meter, refactor to allow multiple post meter tables net/mlx5: DR, Add support for range match action net/mlx5: DR, Add function that tells if STE miss addr has been initialized net/mlx5: DR, Some refactoring of miss address handling net/mlx5: DR, Manage definers with refcounts net/mlx5: DR, Handle FT action in a separate function net/mlx5: DR, Rework is_fw_table function net/mlx5: DR, Add functions to create/destroy MATCH_DEFINER general object net/mlx5: fs, add match on ranges API net/mlx5: mlx5_ifc updates for MATCH_DEFINER general object ==================== Link: https://lore.kernel.org/r/20221209001420.142794-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-09Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-12-08 (ice) Jacob Keller says: This series of patches primarily consists of changes to fix some corner cases that can cause Tx timestamp failures. The issues were discovered and reported by Siddaraju DH and primarily affect E822 hardware, though this series also includes some improvements that affect E810 hardware as well. The primary issue is regarding the way that E822 determines when to generate timestamp interrupts. If the driver reads timestamp indexes which do not have a valid timestamp, the E822 interrupt tracking logic can get stuck. This is due to the way that E822 hardware tracks timestamp index reads internally. I was previously unaware of this behavior as it is significantly different in E810 hardware. Most of the fixes target refactors to ensure that the ice driver does not read timestamp indexes which are not valid on E822 hardware. This is done by using the Tx timestamp ready bitmap register from the PHY. This register indicates what timestamp indexes have outstanding timestamps waiting to be captured. Care must be taken in all cases where we read the timestamp registers, and thus all flows which might have read these registers are refactored. The ice_ptp_tx_tstamp function is modified to consolidate as much of the logic relating to these registers as possible. It now handles discarding stale timestamps which are old or which occurred after a PHC time update. This replaces previously standalone thread functions like the periodic work function and the ice_ptp_flush_tx_tracker function. In addition, some minor cleanups noticed while writing these refactors are included. The remaining patches refactor the E822 implementation to remove the "bypass" mode for timestamps. The E822 hardware has the ability to provide a more precise timestamp by making use of measurements of the precise way that packets flow through the hardware pipeline. These measurements are known as "Vernier" calibration. The "bypass" mode disables many of these measurements in favor of a faster start up time for Tx and Rx timestamping. Instead, once these measurements were captured, the driver tries to reconfigure the PHY to enable the vernier calibrations. Unfortunately this recalibration does not work. Testing indicates that the PHY simply remains in bypass mode without the increased timestamp precision. Remove the attempt at recalibration and always use vernier mode. This has one disadvantage that Tx and Rx timestamps cannot begin until after at least one packet of that type goes through the hardware pipeline. Because of this, further refactor the driver to separate Tx and Rx vernier calibration. Complete the Tx and Rx independently, enabling the appropriate type of timestamp as soon as the relevant packet has traversed the hardware pipeline. This was reported by Milena Olech. Note that although these might be considered "bug fixes", the required changes in order to appropriately resolve these issues is large. Thus it does not feel suitable to send this series to net. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: reschedule ice_ptp_wait_for_offset_valid during reset ice: make Tx and Rx vernier offset calibration independent ice: only check set bits in ice_ptp_flush_tx_tracker ice: handle flushing stale Tx timestamps in ice_ptp_tx_tstamp ice: cleanup allocations in ice_ptp_alloc_tx_tracker ice: protect init and calibrating check in ice_ptp_request_ts ice: synchronize the misc IRQ when tearing down Tx tracker ice: check Tx timestamp memory register for ready timestamps ice: handle discarding old Tx requests in ice_ptp_tx_tstamp ice: always call ice_ptp_link_change and make it void ice: fix misuse of "link err" with "link status" ice: Reset TS memory for all quads ice: Remove the E822 vernier "bypass" logic ice: Use more generic names for ice_ptp_tx fields ==================== Link: https://lore.kernel.org/r/20221208213932.1274143-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-09Merge branch 'mm-hotfixes-stable' into mm-stableAndrew Morton
2022-12-09memcg: fix possible use-after-free in memcg_write_event_control()Tejun Heo
memcg_write_event_control() accesses the dentry->d_name of the specified control fd to route the write call. As a cgroup interface file can't be renamed, it's safe to access d_name as long as the specified file is a regular cgroup file. Also, as these cgroup interface files can't be removed before the directory, it's safe to access the parent too. Prior to 347c4a874710 ("memcg: remove cgroup_event->cft"), there was a call to __file_cft() which verified that the specified file is a regular cgroupfs file before further accesses. The cftype pointer returned from __file_cft() was no longer necessary and the commit inadvertently dropped the file type check with it allowing any file to slip through. With the invarients broken, the d_name and parent accesses can now race against renames and removals of arbitrary files and cause use-after-free's. Fix the bug by resurrecting the file type check in __file_cft(). Now that cgroupfs is implemented through kernfs, checking the file operations needs to go through a layer of indirection. Instead, let's check the superblock and dentry type. Link: https://lkml.kernel.org/r/Y5FRm/cfcKPGzWwl@slm.duckdns.org Fixes: 347c4a874710 ("memcg: remove cgroup_event->cft") Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Jann Horn <jannh@google.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: <stable@vger.kernel.org> [3.14+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-09MAINTAINERS: update Muchun Song's emailMuchun Song
I'm moving to the @linux.dev account. Map my old addresses and update it to my new address. Link: https://lkml.kernel.org/r/20221208115548.85244-1-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-09mm/gup: fix gup_pud_range() for daxJohn Starks
For dax pud, pud_huge() returns true on x86. So the function works as long as hugetlb is configured. However, dax doesn't depend on hugetlb. Commit 414fd080d125 ("mm/gup: fix gup_pmd_range() for dax") fixed devmap-backed huge PMDs, but missed devmap-backed huge PUDs. Fix this as well. This fixes the below kernel panic: general protection fault, probably for non-canonical address 0x69e7c000cc478: 0000 [#1] SMP < snip > Call Trace: <TASK> get_user_pages_fast+0x1f/0x40 iov_iter_get_pages+0xc6/0x3b0 ? mempool_alloc+0x5d/0x170 bio_iov_iter_get_pages+0x82/0x4e0 ? bvec_alloc+0x91/0xc0 ? bio_alloc_bioset+0x19a/0x2a0 blkdev_direct_IO+0x282/0x480 ? __io_complete_rw_common+0xc0/0xc0 ? filemap_range_has_page+0x82/0xc0 generic_file_direct_write+0x9d/0x1a0 ? inode_update_time+0x24/0x30 __generic_file_write_iter+0xbd/0x1e0 blkdev_write_iter+0xb4/0x150 ? io_import_iovec+0x8d/0x340 io_write+0xf9/0x300 io_issue_sqe+0x3c3/0x1d30 ? sysvec_reschedule_ipi+0x6c/0x80 __io_queue_sqe+0x33/0x240 ? fget+0x76/0xa0 io_submit_sqes+0xe6a/0x18d0 ? __fget_light+0xd1/0x100 __x64_sys_io_uring_enter+0x199/0x880 ? __context_tracking_enter+0x1f/0x70 ? irqentry_exit_to_user_mode+0x24/0x30 ? irqentry_exit+0x1d/0x30 ? __context_tracking_exit+0xe/0x70 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x61/0xcb RIP: 0033:0x7fc97c11a7be < snip > </TASK> ---[ end trace 48b2e0e67debcaeb ]--- RIP: 0010:internal_get_user_pages_fast+0x340/0x990 < snip > Kernel panic - not syncing: Fatal exception Kernel Offset: disabled Link: https://lkml.kernel.org/r/1670392853-28252-1-git-send-email-ssengar@linux.microsoft.com Fixes: 414fd080d125 ("mm/gup: fix gup_pmd_range() for dax") Signed-off-by: John Starks <jostarks@microsoft.com> Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Cc: Jan Kara <jack@suse.cz> Cc: Yu Zhao <yuzhao@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-09mmap: fix do_brk_flags() modifying obviously incorrect VMAsLiam Howlett
Add more sanity checks to the VMA that do_brk_flags() will expand. Ensure the VMA matches basic merge requirements within the function before calling can_vma_merge_after(). Drop the duplicate checks from vm_brk_flags() since they will be enforced later. The old code would expand file VMAs on brk(), which is functionally wrong and also dangerous in terms of locking because the brk() path isn't designed for file VMAs and therefore doesn't lock the file mapping. Checking can_vma_merge_after() ensures that new anonymous VMAs can't be merged into file VMAs. See https://lore.kernel.org/linux-mm/CAG48ez1tJZTOjS_FjRZhvtDA-STFmdw8PEizPDwMGFd_ui0Nrw@mail.gmail.com/ Link: https://lkml.kernel.org/r/20221205192304.1957418-1-Liam.Howlett@oracle.com Fixes: 2e7ce7d354f2 ("mm/mmap: change do_brk_flags() to expand existing VMA and add do_brk_munmap()") Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com> Suggested-by: Jann Horn <jannh@google.com> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: SeongJae Park <sj@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-09mm/swap: fix SWP_PFN_BITS with CONFIG_PHYS_ADDR_T_64BIT on 32bitDavid Hildenbrand
We use "unsigned long" to store a PFN in the kernel and phys_addr_t to store a physical address. On a 64bit system, both are 64bit wide. However, on a 32bit system, the latter might be 64bit wide. This is, for example, the case on x86 with PAE: phys_addr_t and PTEs are 64bit wide, while "unsigned long" only spans 32bit. The current definition of SWP_PFN_BITS without MAX_PHYSMEM_BITS misses that case, and assumes that the maximum PFN is limited by an 32bit phys_addr_t. This implies, that SWP_PFN_BITS will currently only be able to cover 4 GiB - 1 on any 32bit system with 4k page size, which is wrong. Let's rely on the number of bits in phys_addr_t instead, but make sure to not exceed the maximum swap offset, to not make the BUILD_BUG_ON() in is_pfn_swap_entry() unhappy. Note that swp_entry_t is effectively an unsigned long and the maximum swap offset shares that value with the swap type. For example, on an 8 GiB x86 PAE system with a kernel config based on Debian 11.5 (-> CONFIG_FLATMEM=y, CONFIG_X86_PAE=y), we will currently fail removing migration entries (remove_migration_ptes()), because mm/page_vma_mapped.c:check_pte() will fail to identify a PFN match as swp_offset_pfn() wrongly masks off PFN bits. For example, split_huge_page_to_list()->...->remap_page() will leave migration entries in place and continue to unlock the page. Later, when we stumble over these migration entries (e.g., via /proc/self/pagemap), pfn_swap_entry_to_page() will BUG_ON() because these migration entries shouldn't exist anymore and the page was unlocked. [ 33.067591] kernel BUG at include/linux/swapops.h:497! [ 33.067597] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 33.067602] CPU: 3 PID: 742 Comm: cow Tainted: G E 6.1.0-rc8+ #16 [ 33.067605] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014 [ 33.067606] EIP: pagemap_pmd_range+0x644/0x650 [ 33.067612] Code: 00 00 00 00 66 90 89 ce b9 00 f0 ff ff e9 ff fb ff ff 89 d8 31 db e8 48 c6 52 00 e9 23 fb ff ff e8 61 83 56 00 e9 b6 fe ff ff <0f> 0b bf 00 f0 ff ff e9 38 fa ff ff 3e 8d 74 26 00 55 89 e5 57 31 [ 33.067615] EAX: ee394000 EBX: 00000002 ECX: ee394000 EDX: 00000000 [ 33.067617] ESI: c1b0ded4 EDI: 00024a00 EBP: c1b0ddb4 ESP: c1b0dd68 [ 33.067619] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010246 [ 33.067624] CR0: 80050033 CR2: b7a00000 CR3: 01bbbd20 CR4: 00350ef0 [ 33.067625] Call Trace: [ 33.067628] ? madvise_free_pte_range+0x720/0x720 [ 33.067632] ? smaps_pte_range+0x4b0/0x4b0 [ 33.067634] walk_pgd_range+0x325/0x720 [ 33.067637] ? mt_find+0x1d6/0x3a0 [ 33.067641] ? mt_find+0x1d6/0x3a0 [ 33.067643] __walk_page_range+0x164/0x170 [ 33.067646] walk_page_range+0xf9/0x170 [ 33.067648] ? __kmem_cache_alloc_node+0x2a8/0x340 [ 33.067653] pagemap_read+0x124/0x280 [ 33.067658] ? default_llseek+0x101/0x160 [ 33.067662] ? smaps_account+0x1d0/0x1d0 [ 33.067664] vfs_read+0x90/0x290 [ 33.067667] ? do_madvise.part.0+0x24b/0x390 [ 33.067669] ? debug_smp_processor_id+0x12/0x20 [ 33.067673] ksys_pread64+0x58/0x90 [ 33.067675] __ia32_sys_ia32_pread64+0x1b/0x20 [ 33.067680] __do_fast_syscall_32+0x4c/0xc0 [ 33.067683] do_fast_syscall_32+0x29/0x60 [ 33.067686] do_SYSENTER_32+0x15/0x20 [ 33.067689] entry_SYSENTER_32+0x98/0xf1 Decrease the indentation level of SWP_PFN_BITS and SWP_PFN_MASK to keep it readable and consistent. [david@redhat.com: rely on sizeof(phys_addr_t) and min_t() instead] Link: https://lkml.kernel.org/r/20221206105737.69478-1-david@redhat.com [david@redhat.com: use "int" for comparison, as we're only comparing numbers < 64] Link: https://lkml.kernel.org/r/1f157500-2676-7cef-a84e-9224ed64e540@redhat.com Link: https://lkml.kernel.org/r/20221205150857.167583-1-david@redhat.com Fixes: 0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry") Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Cc: Hugh Dickins <hughd@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-09tmpfs: fix data loss from failed fallocateHugh Dickins
Fix tmpfs data loss when the fallocate system call is interrupted by a signal, or fails for some other reason. The partial folio handling in shmem_undo_range() forgot to consider this unfalloc case, and was liable to erase or truncate out data which had already been committed earlier. It turns out that none of the partial folio handling there is appropriate for the unfalloc case, which just wants to proceed to removal of whole folios: which find_get_entries() provides, even when partially covered. Original patch by Rui Wang. Link: https://lore.kernel.org/linux-mm/33b85d82.7764.1842e9ab207.Coremail.chenguoqic@163.com/ Link: https://lkml.kernel.org/r/a5dac112-cf4b-7af-a33-f386e347fd38@google.com Fixes: b9a8a4195c7d ("truncate,shmem: Handle truncates that split large folios") Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Guoqi Chen <chenguoqic@163.com> Link: https://lore.kernel.org/all/20221101032248.819360-1-kernel@hev.cc/ Cc: Rui Wang <kernel@hev.cc> Cc: Huacai Chen <chenhuacai@loongson.cn> Cc: Matthew Wilcox <willy@infradead.org> Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: <stable@vger.kernel.org> [5.17+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-09kselftests: cgroup: update kmem test precision toleranceMichal Hocko
1813e51eece0 ("memcg: increase MEMCG_CHARGE_BATCH to 64") has changed the batch size while this test case has been left behind. This has led to a test failure reported by test bot: not ok 2 selftests: cgroup: test_kmem # exit=1 Update the tolerance for the pcp charges to reflect the MEMCG_CHARGE_BATCH change to fix this. [akpm@linux-foundation.org: update comments, per Roman] Link: https://lkml.kernel.org/r/Y4m8Unt6FhWKC6IH@dhcp22.suse.cz Fixes: 1813e51eece0a ("memcg: increase MEMCG_CHARGE_BATCH to 64") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: kernel test robot <yujie.liu@intel.com> Link: https://lore.kernel.org/oe-lkp/202212010958.c1053bd3-yujie.liu@intel.com Acked-by: Shakeel Butt <shakeelb@google.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Tested-by: Yujie Liu <yujie.liu@intel.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: "Michal Koutný" <mkoutny@suse.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-09mm: do not BUG_ON missing brk mapping, because userspace can unmap itJason A. Donenfeld
The following program will trigger the BUG_ON that this patch removes, because the user can munmap() mm->brk: #include <sys/syscall.h> #include <sys/mman.h> #include <assert.h> #include <unistd.h> static void *brk_now(void) { return (void *)syscall(SYS_brk, 0); } static void brk_set(void *b) { assert(syscall(SYS_brk, b) != -1); } int main(int argc, char *argv[]) { void *b = brk_now(); brk_set(b + 4096); assert(munmap(b - 4096, 4096 * 2) == 0); brk_set(b); return 0; } Compile that with musl, since glibc actually uses brk(), and then execute it, and it'll hit this splat: kernel BUG at mm/mmap.c:229! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 12 PID: 1379 Comm: a.out Tainted: G S U 6.1.0-rc7+ #419 RIP: 0010:__do_sys_brk+0x2fc/0x340 Code: 00 00 4c 89 ef e8 04 d3 fe ff eb 9a be 01 00 00 00 4c 89 ff e8 35 e0 fe ff e9 6e ff ff ff 4d 89 a7 20> RSP: 0018:ffff888140bc7eb0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 00000000007e7000 RCX: ffff8881020fe000 RDX: ffff8881020fe001 RSI: ffff8881955c9b00 RDI: ffff8881955c9b08 RBP: 0000000000000000 R08: ffff8881955c9b00 R09: 00007ffc77844000 R10: 0000000000000000 R11: 0000000000000001 R12: 00000000007e8000 R13: 00000000007e8000 R14: 00000000007e7000 R15: ffff8881020fe000 FS: 0000000000604298(0000) GS:ffff88901f700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000603fe0 CR3: 000000015ba9a005 CR4: 0000000000770ee0 PKRU: 55555554 Call Trace: <TASK> do_syscall_64+0x2b/0x50 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x400678 Code: 10 4c 8d 41 08 4c 89 44 24 10 4c 8b 01 8b 4c 24 08 83 f9 2f 77 0a 4c 8d 4c 24 20 4c 01 c9 eb 05 48 8b> RSP: 002b:00007ffc77863890 EFLAGS: 00000212 ORIG_RAX: 000000000000000c RAX: ffffffffffffffda RBX: 000000000040031b RCX: 0000000000400678 RDX: 00000000004006a1 RSI: 00000000007e6000 RDI: 00000000007e7000 RBP: 00007ffc77863900 R08: 0000000000000000 R09: 00000000007e6000 R10: 00007ffc77863930 R11: 0000000000000212 R12: 00007ffc77863978 R13: 00007ffc77863988 R14: 0000000000000000 R15: 0000000000000000 </TASK> Instead, just return the old brk value if the original mapping has been removed. [akpm@linux-foundation.org: fix changelog, per Liam] Link: https://lkml.kernel.org/r/20221202162724.2009-1-Jason@zx2c4.com Fixes: 2e7ce7d354f2 ("mm/mmap: change do_brk_flags() to expand existing VMA and add do_brk_munmap()") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: SeongJae Park <sj@kernel.org> Cc: Yu Zhao <yuzhao@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-09mailmap: update Matti Vaittinen's email addressMatti Vaittinen
The email backend used by ROHM keeps labeling patches as spam. This can result in missing the patches. Switch my mail address from a company mail to a personal one. Link: https://lkml.kernel.org/r/8f4498b66fedcbded37b3b87e0c516e659f8f583.1669912977.git.mazziesaccount@gmail.com Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com> Suggested-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Cc: Anup Patel <anup@brainfault.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Atish Patra <atishp@atishpatra.org> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Ben Widawsky <bwidawsk@kernel.org> Cc: Bjorn Andersson <andersson@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Kirill Tkhai <tkhai@ya.ru> Cc: Qais Yousef <qyousef@layalina.io> Cc: Vasily Averin <vasily.averin@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-09RDMA/rxe: Enable RDMA FLUSH capability for rxe deviceLi Zhijian
Now we are ready to enable RDMA FLUSH capability for RXE. It can support Global Visibility and Persistence placement types. Link: https://lore.kernel.org/r/20221206130201.30986-11-lizhijian@fujitsu.com Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09RDMA/cm: Make QP FLUSHABLE for supported deviceLi Zhijian
Similar to RDMA and Atomic qp attributes enabled by default in CM, enable FLUSH attribute for supported device. That makes applications that are built with rdma_create_ep, rdma_accept APIs have FLUSH qp attribute natively so that user is able to request FLUSH operation simpler. Note that, a FLUSH operation requires FLUSH are supported by both device(HCA) and memory region(MR) and QP at the same time, so it's safe to enable FLUSH qp attribute by default here. FLUSH attribute can be disable by modify_qp() interface. Link: https://lore.kernel.org/r/20221206130201.30986-10-lizhijian@fujitsu.com Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09RDMA/rxe: Implement flush completionLi Zhijian
Per IBA SPEC, FLUSH will ack in rdma read response with 0 length. Use IB_WC_FLUSH (aka IB_UVERBS_WC_FLUSH) code to tell userspace a FLUSH completion. Link: https://lore.kernel.org/r/20221206130201.30986-9-lizhijian@fujitsu.com Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09RDMA/rxe: Implement flush execution in responder sideLi Zhijian
Only the requested placement types that also registered in the destination memory region are acceptable. Otherwise, responder will also reply NAK "Remote Access Error" if it found a placement type violation. We will persist data via arch_wb_cache_pmem(), which could be architecture specific. This commit also adds 2 helpers to update qp.resp from the incoming packet. Link: https://lore.kernel.org/r/20221206130201.30986-8-lizhijian@fujitsu.com Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09RDMA/rxe: Implement RC RDMA FLUSH service in requester sideLi Zhijian
Implement FLUSH request operation in the requester. Link: https://lore.kernel.org/r/20221206130201.30986-7-lizhijian@fujitsu.com Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09RDMA/rxe: Extend rxe packet format to support flushLi Zhijian
Extend rxe opcode tables, headers, helper and constants to support flush operations. Refer to the IBA A19.4.1 for more FETH definition details Link: https://lore.kernel.org/r/20221206130201.30986-6-lizhijian@fujitsu.com Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09RDMA/rxe: Allow registering persistent flag for pmem MR onlyLi Zhijian
Memory region could support at most 2 flush access flags: IB_ACCESS_FLUSH_PERSISTENT and IB_ACCESS_FLUSH_GLOBAL But we only allow user to register persistent flush flags to the pmem MR where it has the ability of persisting data across power cycles. So registering a persistent access flag to a non-pmem MR will be rejected. Link: https://lore.kernel.org/r/20221206130201.30986-5-lizhijian@fujitsu.com CC: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09RDMA/rxe: Extend rxe user ABI to support flushLi Zhijian
This commit extends the rxe user ABI to support the flush operation defined in IBA A19.4.1. These changes are backward compatible with the existing rxe user ABI. The user API request a flush by filling this structure. Link: https://lore.kernel.org/r/20221206130201.30986-4-lizhijian@fujitsu.com Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09RDMA: Extend RDMA kernel verbs ABI to support flushLi Zhijian
This commit extends the RDMA kernel verbs ABI to support the flush operation defined in IBA A19.4.1. These changes are backward compatible with the existing RDMA kernel verbs ABI. It makes device/HCA support new FLUSH attributes/capabilities, and it also makes memory region support new FLUSH access flags. Users can use ibv_reg_mr(3) to register flush access flags. Only the access flags also supported by device's capabilities can be registered successfully. Once registered successfully, it means the MR is flushable. Similarly, A flushable MR should also have one or both of GLOBAL_VISIBILITY and PERSISTENT attributes/capabilities like device/HCA. Link: https://lore.kernel.org/r/20221206130201.30986-3-lizhijian@fujitsu.com Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09RDMA: Extend RDMA user ABI to support flushLi Zhijian
This commit extends the RDMA user ABI to support the flush operation defined in IBA A19.4.1. These changes are backward compatible with the existing RDMA user ABI. Link: https://lore.kernel.org/r/20221206130201.30986-2-lizhijian@fujitsu.com Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09RDMA/rxe: Fix incorrect responder length checkingBob Pearson
The code in rxe_resp.c at check_length() is incorrect as it compares pkt->opcode an 8 bit value against various mask bits which are all higher than 256 so nothing is ever reported. This patch rewrites this to compare against pkt->mask which is correct. However this now exposes another error. For UD send packets the value of the pmtu cannot be determined from qp->mtu. All that is required here is to later check if the payload fits into the posted receive buffer in that case. Fixes: 837a55847ead ("RDMA/rxe: Implement packet length validation on responder") Link: https://lore.kernel.org/r/20221208210945.28607-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Reviewed-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09Documentation/rv: Add verification/rv man pagesDaniel Bristot de Oliveira
Add man pages for the rv command line, using the same scheme we used in rtla. Link: https://lkml.kernel.org/r/e841d7cfbdfc3ebdaf7cbd40278571940145d829.1668180100.git.bristot@kernel.org Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-12-09tools/rv: Add in-kernel monitor interfaceDaniel Bristot de Oliveira
Add the ability to control and trace in-kernel monitors. This is a generic interface, it will check for existing monitors and enable standard setup, like enabling reactors. For example: # rv list wip wakeup in preemptive per-cpu testing monitor. [OFF] wwnr wakeup while not running per-task testing model. [OFF] # rv mon wwnr --help rv version 6.1.0-rc4: help usage: rv mon wwnr [-h] [-q] [-r reactor] [-s] [-v] -h/--help: print this menu and the reactor list -r/--reactor 'reactor': enables the 'reactor' -s/--self: when tracing (-t), also trace rv command -t/--trace: trace monitor's event -v/--verbose: print debug messages available reactors: nop printk panic # rv mon wwnr --trace <TASK>-PID [CPU] TYPE ID STATE x EVENT -> NEXT_STATE FINAL | | | | | | | | | rv-3613 [001] event 3613 running x switch_out -> not_running Y sshd-1248 [005] event 1248 running x switch_out -> not_running Y <idle>-0 [005] event 71 not_running x wakeup -> not_running Y <idle>-0 [005] event 71 not_running x switch_in -> running N kcompactd0-71 [005] event 71 running x switch_out -> not_running Y <idle>-0 [000] event 860 not_running x wakeup -> not_running Y <idle>-0 [000] event 860 not_running x switch_in -> running N systemd-oomd-860 [000] event 860 running x switch_out -> not_running Y <idle>-0 [000] event 860 not_running x wakeup -> not_running Y <idle>-0 [000] event 860 not_running x switch_in -> running N systemd-oomd-860 [000] event 860 running x switch_out -> not_running Y <idle>-0 [005] event 71 not_running x wakeup -> not_running Y <idle>-0 [005] event 71 not_running x switch_in -> running N kcompactd0-71 [005] event 71 running x switch_out -> not_running Y <idle>-0 [000] event 860 not_running x wakeup -> not_running Y <idle>-0 [000] event 860 not_running x switch_in -> running N systemd-oomd-860 [000] event 860 running x switch_out -> not_running Y <idle>-0 [001] event 3613 not_running x wakeup -> not_running Y Link: https://lkml.kernel.org/r/1e57547e3acadda6e23949b2672c89e76ec2ec42.1668180100.git.bristot@kernel.org Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-12-09rv: Add rv toolDaniel Bristot de Oliveira
This is the (user-space) runtime verification tool, named rv. This tool aims to be the interface for in-kernel rv monitors, as well as the home for monitors in user-space (online asynchronous), and in *eBPF. The tool receives a command as the first argument, the current commands are: list - list all available monitors mon - run a given monitor Each monitor is an independent piece of software inside the tool and can have their own arguments. There is no monitor implemented in this patch, it only adds the basic structure of the tool, based on rtla. # rv --help rv version 6.1.0-rc4: help usage: rv command [-h] [command_options] -h/--help: print this menu command: run one of the following command: list: list all available monitors mon: run a monitor [command options]: each command has its own set of options run rv command -h for further information *dot2bpf is the next patch set, depends on this, doing cleanups. Link: https://lkml.kernel.org/r/fb51184f3b95aea0d7bfdc33ec09f4153aee84fa.1668180100.git.bristot@kernel.org Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-12-09rtla: Fix exit status when returning from calls to usage()John Kacur
rtla_usage(), osnoise_usage() and timerlat_usage() all exit with an error status. However when these are called from help, they should exit with a non-error status. Fix this by passing the exit status to the functions. Note, although we remove the subsequent call to exit after calling usage, we leave it in at the end of a function to suppress the compiler warning "control reaches end of a non-void function". Link: https://lkml.kernel.org/r/20221107144313.22470-1-jkacur@redhat.com Signed-off-by: John Kacur <jkacur@redhat.com> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-12-09MIPS: OCTEON: warn only once if deprecated link status is being usedLadislav Michl
Avoid flooding kernel log with warnings. Fixes: 2c0756d306c2 ("MIPS: OCTEON: warn if deprecated link status is being used") Signed-off-by: Ladislav Michl <ladis@linux-mips.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-12-09MIPS: BCM63xx: Add check for NULL for clk in clk_enableAnastasia Belova
Check clk for NULL before calling clk_enable_unlocked where clk is dereferenced. There is such check in other implementations of clk_enable. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: e7300d04bd08 ("MIPS: BCM63xx: Add support for the Broadcom BCM63xx family of SOCs.") Signed-off-by: Anastasia Belova <abelova@astralinux.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2022-12-09dt-bindings: lcdif: Fix constraints for imx8mpAlexander Stein
i.MX8MP uses 3 clocks, so soften the restrictions for clocks & clock-names. This SoC requires a power-domain for this peripheral to use. Add it as a required property. Fixes: f5419cb0743f ("dt-bindings: lcdif: Add compatible for i.MX8MP") Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com> Link: https://lore.kernel.org/r/20221208140840.3227035-1-alexander.stein@ew.tq-group.com Signed-off-by: Rob Herring <robh@kernel.org>
2022-12-09media: dt-bindings: atmel,isc: Drop unneeded unevaluatedPropertiesRob Herring
The 'port' node schema has both 'additionalProperties' and 'unevaluatedProperties', but only one is necessary. 'additionalProperties' works here, so drop 'unevaluatedProperties' and move 'additionalProperties' next to the $ref. Link: https://lore.kernel.org/r/20221207204406.2810864-1-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>
2022-12-09RDMA/rxe: Fix oops with zero length readsDaisuke Matsuda
The commit 686d348476ee ("RDMA/rxe: Remove unnecessary mr testing") causes a kernel crash. If responder get a zero-byte RDMA Read request, qp->resp.mr is not set in check_rkey() (see IBA C9-88). The mr is NULL in this case, and a NULL pointer dereference occurs as shown below. BUG: kernel NULL pointer dereference, address: 0000000000000010 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP PTI CPU: 2 PID: 3622 Comm: python3 Kdump: loaded Not tainted 6.1.0-rc3+ #34 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:__rxe_put+0xc/0x60 [rdma_rxe] Code: cc cc cc 31 f6 e8 64 36 1b d3 41 b8 01 00 00 00 44 89 c0 c3 cc cc cc cc 41 89 c0 eb c1 90 0f 1f 44 00 00 41 54 b8 ff ff ff ff <f0> 0f c1 47 10 83 f8 01 74 11 45 31 e4 85 c0 7e 20 44 89 e0 41 5c RSP: 0018:ffffb27bc012ce78 EFLAGS: 00010246 RAX: 00000000ffffffff RBX: ffff9790857b0580 RCX: 0000000000000000 RDX: ffff979080fe145a RSI: 000055560e3e0000 RDI: 0000000000000000 RBP: ffff97909c7dd800 R08: 0000000000000001 R09: e7ce43d97f7bed0f R10: ffff97908b29c300 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: ffff97908b29c300 R15: 0000000000000000 FS: 00007f276f7bd740(0000) GS:ffff9792b5c80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 0000000114230002 CR4: 0000000000060ee0 Call Trace: <IRQ> read_reply+0xda/0x310 [rdma_rxe] rxe_responder+0x82d/0xe50 [rdma_rxe] do_task+0x84/0x170 [rdma_rxe] tasklet_action_common.constprop.0+0xa7/0x120 __do_softirq+0xcb/0x2ac do_softirq+0x63/0x90 </IRQ> Support a NULL mr during read_reply() Fixes: 686d348476ee ("RDMA/rxe: Remove unnecessary mr testing") Fixes: b5f9a01fae42 ("RDMA/rxe: Fix mr leak in RESPST_ERR_RNR") Link: https://lore.kernel.org/r/20221209045926.531689-1-matsuda-daisuke@fujitsu.com Link: https://lore.kernel.org/r/20221202145713.13152-1-lizhijian@fujitsu.com Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09Merge tag 'v6.1-rc8' into rdma.git for-nextJason Gunthorpe
For dependencies in following patches Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09iommufd: Change the order of MSI setupJason Gunthorpe
Eric points out this is wrong for the rare case of someone using allow_unsafe_interrupts on ARM. We always have to setup the MSI window in the domain if the iommu driver asks for it. Move the iommu_get_msi_cookie() setup to the top of the function and always do it, regardless of the security mode. Add checks to iommufd_device_setup_msi() to ensure the driver is not doing something incomprehensible. No current driver will set both a HW and SW MSI window, or have more than one SW MSI window. Fixes: e8d57210035b ("iommufd: Add kAPI toward external drivers for physical devices") Link: https://lore.kernel.org/r/3-v1-0362a1a1c034+98-iommufd_fixes1_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reported-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09iommufd: Improve a few unclear bits of codeJason Gunthorpe
Correct a few items noticed late in review: - We should assert that the math in batch_clear_carry() doesn't underflow - user->locked should be -1 not 0 sicne we just did mmput - npages should not have been recalculated, it already has that value No functional change. Fixes: 8d160cd4d506 ("iommufd: Algorithms for PFN storage") Link: https://lore.kernel.org/r/2-v1-0362a1a1c034+98-iommufd_fixes1_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reported-by: Binbin Wu <binbin.wu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09iommufd: Fix comment typosJason Gunthorpe
Repair some typos in comments that were noticed late in the review cycle. Fixes: f394576eb11d ("iommufd: PFN handling for iopt_pages") Link: https://lore.kernel.org/r/1-v1-0362a1a1c034+98-iommufd_fixes1_jgg@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reported-by: Binbin Wu <binbin.wu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09Merge tag 'media/v6.1-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fix from Mauro Carvalho Chehab: "A v4l-core fix related to validating DV timings related to video blanking values" * tag 'media/v6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: v4l2-dv-timings.c: fix too strict blanking sanity checks
2022-12-09Merge tag 'soc-fixes-6.1-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fix from Arnd Bergmann: "One more last minute revert for a boot regression that was found on the popular colibri-imx7" * tag 'soc-fixes-6.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: Revert "ARM: dts: imx7: Fix NAND controller size-cells"
2022-12-09clk: nomadik: correct struct name kernel-doc warningRandy Dunlap
Use the correct struct name for the kernel-doc notation to prevent a kernel-doc warning: clk-nomadik.c:148: warning: expecting prototype for struct clk_pll1. Prototype was for struct clk_pll instead Fixes: ef6eb322ce57 ("clk: nomadik: implement the Nomadik clocks properly") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Michael Turquette <mturquette@baylibre.com> Cc: Stephen Boyd <sboyd@kernel.org> Cc: linux-clk@vger.kernel.org Link: https://lore.kernel.org/r/20221209002016.14776-1-rdunlap@infradead.org Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2022-12-09docs/bpf: Add documentation for BPF_MAP_TYPE_SK_STORAGEDonald Hunter
Add documentation for the BPF_MAP_TYPE_SK_STORAGE including kernel version introduced, usage and examples. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20221209112401.69319-1-donald.hunter@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-12-09regmap-irq: Add handle_mask_sync() callbackWilliam Breathitt Gray
Provide a public callback handle_mask_sync() that drivers can use when they have more complex IRQ masking logic. The default implementation is regmap_irq_handle_mask_sync(), used if the chip doesn't provide its own callback. Cc: Mark Brown <broonie@kernel.org> Signed-off-by: William Breathitt Gray <william.gray@linaro.org> Link: https://lore.kernel.org/r/e083474b3d467a86e6cb53da8072de4515bd6276.1669100542.git.william.gray@linaro.org Signed-off-by: Mark Brown <broonie@kernel.org>
2022-12-09spi: dt-bindings: Convert Synquacer SPI to DT schemaRob Herring
Convert the Socionext Synquacer SPI binding to DT format. Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20221209171644.3351787-1-robh@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2022-12-09lsm: Fix description of fs_context_parse_paramRoberto Sassu
The fs_context_parse_param hook already has a description, which seems the right one according to the code. Fixes: 8eb687bc8069 ("lsm: Add/fix return values in lsm_hooks.h and fix formatting") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-12-09Merge tag 'timers-v6.2-rc1' of ↵Thomas Gleixner
https://git.linaro.org/people/daniel.lezcano/linux into timers/core Pull clockevent/source driver updates from Daniel Lezcano: - Add DT bindings for the Rockchip rk3128 timer (Johan Jonker) - Change the DT bindings for the npcm7xx timer in order to specify multiple clocks and enable the clock for the timer1 on WPCM450 (Jonathan Neuschäfer) - Fix the timer duration being too long the ARM architected timer in order to prevent an integer overflow leading to a negative value and an immediate interruption (Joe Korty) - Fix an unused pointer warning reported by lkp and some cleanups in the timer TI dm (Tony Lindgren) - Fix a missing call to clk_disable_unprepare() in the error path at init time on the timer TI dm (Yang Yingliang) - Use kstrtobool() instead of strtobool() in the ARM architected timer (Christophe JAILLET) - Add DT bindings for r8a779g0 on Renesas platform (Wolfram Sang) Link: https://lore.kernel.org/all/3c4c3bb2-b849-0c87-0948-8a36984bdde4@linaro.org
2022-12-09x86/vdso: Conditionally export __vdso_sgx_enter_enclave()Nathan Chancellor
Recently, ld.lld moved from '--undefined-version' to '--no-undefined-version' as the default, which breaks building the vDSO when CONFIG_X86_SGX is not set: ld.lld: error: version script assignment of 'LINUX_2.6' to symbol '__vdso_sgx_enter_enclave' failed: symbol not defined __vdso_sgx_enter_enclave is only included in the vDSO when CONFIG_X86_SGX is set. Only export it if it will be present in the final object, which clears up the error. Fixes: 8466436952017 ("x86/vdso: Implement a vDSO for Intel SGX enclave call") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://github.com/ClangBuiltLinux/linux/issues/1756 Link: https://lore.kernel.org/r/20221109000306.1407357-1-nathan@kernel.org