summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2022-08-29media: v4l2-ctrls: Export default v4l2_ctrl_type_ops callbacksXavier Roumegue
Export the callback functions of the default v4l2 control type operations such as a driver defining its own operations could reuse some of them. Signed-off-by: Xavier Roumegue <xavier.roumegue@oss.nxp.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2022-08-29media: v4l2-ctrls: optimize type_ops for arraysHans Verkuil
Initializing arrays and validating or checking for equality of arrays is suboptimal since it does this per element. Change the ops to operate on the whole payload to speed up array operations. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
2022-08-29Merge drm/drm-next into drm-intel-nextJani Nikula
Sync drm-intel-next with v6.0-rc as well as recent drm-intel-gt-next. Since drm-next does not have commit f0c70d41e4e8 ("drm/i915/guc: remove runtime info printing from time stamp logging") yet, only drm-intel-gt-next, will need to do that as part of the merge here to build. Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2022-08-29ethernet: Add helpers to recognize addresses mapped to IP multicastCasper Andersson
IP multicast must sometimes be discriminated from non-IP multicast, e.g. when determining the forwarding behavior of a given group in the presence of multicast router ports on an offloaded bridge. Therefore, provide helpers to identify these groups. Signed-off-by: Casper Andersson <casper.casan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-29genetlink: start to validate reserved header bytesJakub Kicinski
We had historically not checked that genlmsghdr.reserved is 0 on input which prevents us from using those precious bytes in the future. One use case would be to extend the cmd field, which is currently just 8 bits wide and 256 is not a lot of commands for some core families. To make sure that new families do the right thing by default put the onus of opting out of validation on existing families. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Paul Moore <paul@paul-moore.com> (NetLabel) Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-29x86/earlyprintk: Clean up pciserialPeter Zijlstra
While working on a GRUB patch to support PCI-serial, a number of cleanups were suggested that apply to the code I took inspiration from. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci_ids.h Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lkml.kernel.org/r/YwdeyCEtW+wa+QhH@worktop.programming.kicks-ass.net
2022-08-29xfrm: lwtunnel: add lwtunnel support for xfrm interfaces in collect_md modeEyal Birger
Allow specifying the xfrm interface if_id and link as part of a route metadata using the lwtunnel infrastructure. This allows for example using a single xfrm interface in collect_md mode as the target of multiple routes each specifying a different if_id. With the appropriate changes to iproute2, considering an xfrm device ipsec1 in collect_md mode one can for example add a route specifying an if_id like so: ip route add <SUBNET> dev ipsec1 encap xfrm if_id 1 In which case traffic routed to the device via this route would use if_id in the xfrm interface policy lookup. Or in the context of vrf, one can also specify the "link" property: ip route add <SUBNET> dev ipsec1 encap xfrm if_id 1 link_dev eth15 Note: LWT_XFRM_LINK uses NLA_U32 similar to IFLA_XFRM_LINK even though internally "link" is signed. This is consistent with other _LINK attributes in other devices as well as in bpf and should not have an effect as device indexes can't be negative. Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-08-29xfrm: interface: support collect metadata modeEyal Birger
This commit adds support for 'collect_md' mode on xfrm interfaces. Each net can have one collect_md device, created by providing the IFLA_XFRM_COLLECT_METADATA flag at creation. This device cannot be altered and has no if_id or link device attributes. On transmit to this device, the if_id is fetched from the attached dst metadata on the skb. If exists, the link property is also fetched from the metadata. The dst metadata type used is METADATA_XFRM which holds these properties. On the receive side, xfrmi_rcv_cb() populates a dst metadata for each packet received and attaches it to the skb. The if_id used in this case is fetched from the xfrm state, and the link is fetched from the incoming device. This information can later be used by upper layers such as tc, ebpf, and ip rules. Because the skb is scrubed in xfrmi_rcv_cb(), the attachment of the dst metadata is postponed until after scrubing. Similarly, xfrm_input() is adapted to avoid dropping metadata dsts by only dropping 'valid' (skb_valid_dst(skb) == true) dsts. Policy matching on packets arriving from collect_md xfrmi devices is done by using the xfrm state existing in the skb's sec_path. The xfrm_if_cb.decode_cb() interface implemented by xfrmi_decode_session() is changed to keep the details of the if_id extraction tucked away in xfrm_interface.c. Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-08-29net: allow storing xfrm interface metadata in metadata_dstEyal Birger
XFRM interfaces provide the association of various XFRM transformations to a netdevice using an 'if_id' identifier common to both the XFRM data structures (polcies, states) and the interface. The if_id is configured by the controlling entity (usually the IKE daemon) and can be used by the administrator to define logical relations between different connections. For example, different connections can share the if_id identifier so that they pass through the same interface, . However, currently it is not possible for connections using a different if_id to use the same interface while retaining the logical separation between them, without using additional criteria such as skb marks or different traffic selectors. When having a large number of connections, it is useful to have a the logical separation offered by the if_id identifier but use a single network interface. Similar to the way collect_md mode is used in IP tunnels. This patch attempts to enable different configuration mechanisms - such as ebpf programs, LWT encapsulations, and TC - to attach metadata to skbs which would carry the if_id. This way a single xfrm interface in collect_md mode can demux traffic based on this configuration on tx and provide this metadata on rx. The XFRM metadata is somewhat similar to ip tunnel metadata in that it has an "id", and shares similar configuration entities (bpf, tc, ...), however, it does not necessarily represent an IP tunnel or use other ip tunnel information, and also has an optional "link" property which can be used for affecting underlying routing decisions. Additional xfrm related criteria may also be added in the future. Therefore, a new metadata type is introduced, to be used in subsequent patches in the xfrm interface and configuration entities. Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Eyal Birger <eyal.birger@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2022-08-29perf: Add PERF_BR_NEW_ARCH_[N] map for BRBE on arm64 platformAnshuman Khandual
BRBE captured branch types will overflow perf_branch_entry.type and generic branch types in perf_branch_entry.new_type. So override each available arch specific branch type in the following manner to comprehensively process all reported branch types in BRBE. PERF_BR_ARM64_FIQ PERF_BR_NEW_ARCH_1 PERF_BR_ARM64_DEBUG_HALT PERF_BR_NEW_ARCH_2 PERF_BR_ARM64_DEBUG_EXIT PERF_BR_NEW_ARCH_3 PERF_BR_ARM64_DEBUG_INST PERF_BR_NEW_ARCH_4 PERF_BR_ARM64_DEBUG_DATA PERF_BR_NEW_ARCH_5 Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: James Clark <james.clark@arm.com> Link: https://lkml.kernel.org/r/20220824044822.70230-5-anshuman.khandual@arm.com
2022-08-29perf: Capture branch privilege informationAnshuman Khandual
Platforms like arm64 could capture privilege level information for all the branch records. Hence this adds a new element in the struct branch_entry to record the privilege level information, which could be requested through a new event.attr.branch_sample_type based flag PERF_SAMPLE_BRANCH_PRIV_SAVE. This flag helps user choose whether privilege information is captured. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: James Clark <james.clark@arm.com> Link: https://lkml.kernel.org/r/20220824044822.70230-4-anshuman.khandual@arm.com
2022-08-29perf: Extend branch type classificationAnshuman Khandual
branch_entry.type now has ran out of space to accommodate more branch types classification. This will prevent perf branch stack implementation on arm64 (via BRBE) to capture all available branch types. Extending this bit field i.e branch_entry.type [4 bits] is not an option as it will break user space ABI both for little and big endian perf tools. Extend branch classification with a new field branch_entry.new_type via a new branch type PERF_BR_EXTEND_ABI in branch_entry.type. Perf tools which could decode PERF_BR_EXTEND_ABI, will then parse branch_entry.new_type as well. branch_entry.new_type is a 4 bit field which can hold upto 16 branch types. The first three branch types will hold various generic page faults followed by five architecture specific branch types, which can be overridden by the platform for specific use cases. These architecture specific branch types gets overridden on arm64 platform for BRBE implementation. New generic branch types - PERF_BR_NEW_FAULT_ALGN - PERF_BR_NEW_FAULT_DATA - PERF_BR_NEW_FAULT_INST New arch specific branch types - PERF_BR_NEW_ARCH_1 - PERF_BR_NEW_ARCH_2 - PERF_BR_NEW_ARCH_3 - PERF_BR_NEW_ARCH_4 - PERF_BR_NEW_ARCH_5 Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: James Clark <james.clark@arm.com> Link: https://lkml.kernel.org/r/20220824044822.70230-3-anshuman.khandual@arm.com
2022-08-29perf: Add system error and not in transaction branch typesAnshuman Khandual
This expands generic branch type classification by adding two more entries there in i.e system error and not in transaction. This also updates the x86 implementation to process X86_BR_NO_TX records as appropriate. This changes branch types reported to user space on x86 platform but it should not be a problem. The possible scenarios and impacts are enumerated here. -------------------------------------------------------------------------- | kernel | perf tool | Impact | -------------------------------------------------------------------------- | old | old | Works as before | -------------------------------------------------------------------------- | old | new | PERF_BR_UNKNOWN is processed | -------------------------------------------------------------------------- | new | old | PERF_BR_NO_TX is blocked via old PERF_BR_MAX | -------------------------------------------------------------------------- | new | new | PERF_BR_NO_TX is recognized | -------------------------------------------------------------------------- When PERF_BR_NO_TX is blocked via old PERF_BR_MAX (new kernel with old perf tool) the user space might throw up an warning complaining about an unrecognized branch types being reported, but it's expected. PERF_BR_SERROR & PERF_BR_NO_TX branch types will be used for BRBE implementation on arm64 platform. PERF_BR_NO_TX complements 'abort' and 'in_tx' elements in perf_branch_entry which represent other transaction states for a given branch record. Because this completes the transaction state classification. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: James Clark <james.clark@arm.com> Link: https://lkml.kernel.org/r/20220824044822.70230-2-anshuman.khandual@arm.com
2022-08-28Merge tag 'mm-hotfixes-stable-2022-08-28' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more hotfixes from Andrew Morton: "Seventeen hotfixes. Mostly memory management things. Ten patches are cc:stable, addressing pre-6.0 issues" * tag 'mm-hotfixes-stable-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: .mailmap: update Luca Ceresoli's e-mail address mm/mprotect: only reference swap pfn page if type match squashfs: don't call kmalloc in decompressors mm/damon/dbgfs: avoid duplicate context directory creation mailmap: update email address for Colin King asm-generic: sections: refactor memory_intersects bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown Revert "memcg: cleanup racy sum avoidance code" mm/zsmalloc: do not attempt to free IS_ERR handle binder_alloc: add missing mmap_lock calls when using the VMA mm: re-allow pinning of zero pfns (again) vmcoreinfo: add kallsyms_num_syms symbol mailmap: update Guilherme G. Piccoli's email addresses writeback: avoid use-after-free after removing device shmem: update folio if shmem_replace_page() updates the page mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte
2022-08-28asm-generic: sections: refactor memory_intersectsQuanyang Wang
There are two problems with the current code of memory_intersects: First, it doesn't check whether the region (begin, end) falls inside the region (virt, vend), that is (virt < begin && vend > end). The second problem is if vend is equal to begin, it will return true but this is wrong since vend (virt + size) is not the last address of the memory region but (virt + size -1) is. The wrong determination will trigger the misreporting when the function check_for_illegal_area calls memory_intersects to check if the dma region intersects with stext region. The misreporting is as below (stext is at 0x80100000): WARNING: CPU: 0 PID: 77 at kernel/dma/debug.c:1073 check_for_illegal_area+0x130/0x168 DMA-API: chipidea-usb2 e0002000.usb: device driver maps memory from kernel text or rodata [addr=800f0000] [len=65536] Modules linked in: CPU: 1 PID: 77 Comm: usb-storage Not tainted 5.19.0-yocto-standard #5 Hardware name: Xilinx Zynq Platform unwind_backtrace from show_stack+0x18/0x1c show_stack from dump_stack_lvl+0x58/0x70 dump_stack_lvl from __warn+0xb0/0x198 __warn from warn_slowpath_fmt+0x80/0xb4 warn_slowpath_fmt from check_for_illegal_area+0x130/0x168 check_for_illegal_area from debug_dma_map_sg+0x94/0x368 debug_dma_map_sg from __dma_map_sg_attrs+0x114/0x128 __dma_map_sg_attrs from dma_map_sg_attrs+0x18/0x24 dma_map_sg_attrs from usb_hcd_map_urb_for_dma+0x250/0x3b4 usb_hcd_map_urb_for_dma from usb_hcd_submit_urb+0x194/0x214 usb_hcd_submit_urb from usb_sg_wait+0xa4/0x118 usb_sg_wait from usb_stor_bulk_transfer_sglist+0xa0/0xec usb_stor_bulk_transfer_sglist from usb_stor_bulk_srb+0x38/0x70 usb_stor_bulk_srb from usb_stor_Bulk_transport+0x150/0x360 usb_stor_Bulk_transport from usb_stor_invoke_transport+0x38/0x440 usb_stor_invoke_transport from usb_stor_control_thread+0x1e0/0x238 usb_stor_control_thread from kthread+0xf8/0x104 kthread from ret_from_fork+0x14/0x2c Refactor memory_intersects to fix the two problems above. Before the 1d7db834a027e ("dma-debug: use memory_intersects() directly"), memory_intersects is called only by printk_late_init: printk_late_init -> init_section_intersects ->memory_intersects. There were few places where memory_intersects was called. When commit 1d7db834a027e ("dma-debug: use memory_intersects() directly") was merged and CONFIG_DMA_API_DEBUG is enabled, the DMA subsystem uses it to check for an illegal area and the calltrace above is triggered. [akpm@linux-foundation.org: fix nearby comment typo] Link: https://lkml.kernel.org/r/20220819081145.948016-1-quanyang.wang@windriver.com Fixes: 979559362516 ("asm/sections: add helpers to check for section data") Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Thierry Reding <treding@nvidia.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28Revert "memcg: cleanup racy sum avoidance code"Shakeel Butt
This reverts commit 96e51ccf1af33e82f429a0d6baebba29c6448d0f. Recently we started running the kernel with rstat infrastructure on production traffic and begin to see negative memcg stats values. Particularly the 'sock' stat is the one which we observed having negative value. $ grep "sock " /mnt/memory/job/memory.stat sock 253952 total_sock 18446744073708724224 Re-run after couple of seconds $ grep "sock " /mnt/memory/job/memory.stat sock 253952 total_sock 53248 For now we are only seeing this issue on large machines (256 CPUs) and only with 'sock' stat. I think the networking stack increase the stat on one cpu and decrease it on another cpu much more often. So, this negative sock is due to rstat flusher flushing the stats on the CPU that has seen the decrement of sock but missed the CPU that has increments. A typical race condition. For easy stable backport, revert is the most simple solution. For long term solution, I am thinking of two directions. First is just reduce the race window by optimizing the rstat flusher. Second is if the reader sees a negative stat value, force flush and restart the stat collection. Basically retry but limited. Link: https://lkml.kernel.org/r/20220817172139.3141101-1-shakeelb@google.com Fixes: 96e51ccf1af33e8 ("memcg: cleanup racy sum avoidance code") Signed-off-by: Shakeel Butt <shakeelb@google.com> Cc: "Michal Koutný" <mkoutny@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Muchun Song <songmuchun@bytedance.com> Cc: David Hildenbrand <david@redhat.com> Cc: Yosry Ahmed <yosryahmed@google.com> Cc: Greg Thelen <gthelen@google.com> Cc: <stable@vger.kernel.org> [5.15] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28mm: re-allow pinning of zero pfns (again)Alex Williamson
The below referenced commit makes the same error as 1c563432588d ("mm: fix is_pinnable_page against a cma page"), re-interpreting the logic to exclude pinning of the zero page, which breaks device assignment with vfio. To avoid further subtle mistakes, split the logic into discrete tests. [akpm@linux-foundation.org: simplify comment, per John] Link: https://lkml.kernel.org/r/166015037385.760108.16881097713975517242.stgit@omen Link: https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen Fixes: f25cbb7a95a2 ("mm: add zone device coherent type memory support") Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Suggested-by: Felix Kuehling <felix.kuehling@amd.com> Tested-by: Slawomir Laba <slawomirx.laba@intel.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Alex Sierra <alex.sierra@amd.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Alistair Popple <apopple@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28lib/string_helpers: Add str_read_write() helperAndy Shevchenko
Add str_read_write() helper to return 'read' or 'write' string literal. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Dmitry Rokosov <ddrokosov@sberdevices.ru> Link: https://lore.kernel.org/r/20220822175011.2886-2-ddrokosov@sberdevices.ru Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-08-28units: complement the set of Hz unitsDmitry Rokosov
Currently, Hz units do not have milli, micro and nano Hz coefficients. Some drivers (IIO especially) use their analogues to calculate appropriate Hz values. This patch includes them to units.h definitions, so they can be used from different kernel places. Signed-off-by: Dmitry Rokosov <ddrokosov@sberdevices.ru> Link: https://lore.kernel.org/r/20220812165243.22177-3-ddrokosov@sberdevices.ru Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-08-27drm/gem: Add LRU/shrinker helperRob Clark
Add a simple LRU helper to assist with driver's shrinker implementation. It handles tracking the number of backing pages associated with a given LRU, and provides a helper to implement shrinker_scan. A driver can use multiple LRU instances to track objects in various states, for example a dontneed LRU for purgeable objects, a willneed LRU for evictable objects, and an unpinned LRU for objects without backing pages. All LRUs that the object can be moved between must share a single lock. v2: lockdep_assert_held() instead of WARN_ON(!mutex_is_locked()) v3: make drm_gem_lru_move_tail_locked() static until there is a user Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Patchwork: https://patchwork.freedesktop.org/patch/496128/ Link: https://lore.kernel.org/r/20220802155152.1727594-10-robdclark@gmail.com
2022-08-26bpf: Fix a few typos in BPF helpers documentationQuentin Monnet
Address a few typos in the documentation for the BPF helper functions. They were reported by Jakub [0], who ran spell checkers on the generated man page [1]. [0] https://lore.kernel.org/linux-man/d22dcd47-023c-8f52-d369-7b5308e6c842@gmail.com/T/#mb02e7d4b7fb61d98fa914c77b581184e9a9537af [1] https://lore.kernel.org/linux-man/eb6a1e41-c48e-ac45-5154-ac57a2c76108@gmail.com/T/#m4a8d1b003616928013ffcd1450437309ab652f9f v3: Do not copy unrelated (and breaking) elements to tools/ header v2: Turn a ',' into a ';' Reported-by: Jakub Wilk <jwilk@jwilk.net> Signed-off-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220825220806.107143-1-quentin@isovalent.com
2022-08-26openvswitch: allow specifying ifindex of new interfacesAndrey Zhadchenko
CRIU is preserving ifindexes of net devices after restoration. However, current Open vSwitch API does not allow to target ifindex, so we cannot correctly restore OVS configuration. Add new OVS_DP_ATTR_IFINDEX for OVS_DP_CMD_NEW and use it as desired ifindex. Use OVS_VPORT_ATTR_IFINDEX during OVS_VPORT_CMD_NEW to specify new netdev ifindex. Signed-off-by: Andrey Zhadchenko <andrey.zhadchenko@virtuozzo.com> Acked-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-27perf/core: Add speculation info to branch entriesSandipan Das
Add a new "spec" bitfield to branch entries for providing speculation information. This will be populated using hints provided by branch sampling features on supported hardware. The following cases are covered: * No branch speculation information is available * Branch is speculative but taken on the wrong path * Branch is non-speculative but taken on the correct path * Branch is speculative and taken on the correct path Suggested-by: Stephane Eranian <eranian@google.com> Signed-off-by: Sandipan Das <sandipan.das@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/834088c302faf21c7b665031dd111f424e509a64.1660211399.git.sandipan.das@amd.com
2022-08-26Merge branch 'for-6.0-fixes' into for-6.1Tejun Heo
Pulling to receive 43626dade36f ("group: Add missing cpus_read_lock() to cgroup_attach_task_all()") for a follow-up patch.
2022-08-26cgroup: Homogenize cgroup_get_from_id() return valueMichal Koutný
Cgroup id is user provided datum hence extend its return domain to include possible error reason (similar to cgroup_get_from_fd()). This change also fixes commit d4ccaf58a847 ("bpf: Introduce cgroup iter") that would use NULL instead of proper error handling in d4ccaf58a847 ("bpf: Introduce cgroup iter"). Additionally, neither of: fc_appid_store, bpf_iter_attach_cgroup, mem_cgroup_get_from_ino (callers of cgroup_get_from_fd) is built without CONFIG_CGROUPS (depends via CONFIG_BLK_CGROUP, direct, transitive CONFIG_MEMCG respectively) transitive, so drop the singular definition not needed with !CONFIG_CGROUPS. Fixes: d4ccaf58a847 ("bpf: Introduce cgroup iter") Signed-off-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2022-08-26Merge tag 'io_uring-6.0-2022-08-26' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fixes from Jens Axboe: - Add missing header file to the MAINTAINERS entry for io_uring (Ammar) - liburing and the kernel ship the same io_uring.h header, but one change we've had for a long time only in liburing is to ensure it's C++ safe. Add extern C around it, so we can more easily sync them in the future (Ammar) - Fix an off-by-one in the sync cancel added in this merge window (me) - Error handling fix for passthrough (Kanchan) - Fix for address saving for async execution for the zc tx support (Pavel) - Fix ordering for TCP zc notifications, so we always have them ordered correctly between "data was sent" and "data was acked". This isn't strictly needed with the notification slots, but we've been pondering disabling the slot support for 6.0 - and if we do, then we do require the ordering to be sane. Regardless of that, it's the sane thing to do in terms of API (Pavel) - Minor cleanup for indentation and lockdep annotation (Pavel) * tag 'io_uring-6.0-2022-08-26' of git://git.kernel.dk/linux-block: io_uring/net: save address for sendzc async execution io_uring: conditional ->async_data allocation io_uring/notif: order notif vs send CQEs io_uring/net: fix indentation io_uring/net: fix zc send link failing io_uring/net: fix must_hold annotation io_uring: fix submission-failure handling for uring-cmd io_uring: fix off-by-one in sync cancelation file check io_uring: uapi: Add `extern "C"` in io_uring.h for liburing MAINTAINERS: Add `include/linux/io_uring_types.h`
2022-08-26Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Ten fixes. Of the three core changes, the two large ones are a complete reversion of the async rework and an ALUA timing rework (the latter shouldn't affect non-ALUA paths). The remaining patches are all small and all but one in drivers" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sd: Revert "Rework asynchronous resume support" scsi: core: Fix passthrough retry counter handling scsi: ufs: core: Reduce the power mode change timeout scsi: storvsc: Remove WQ_MEM_RECLAIM from storvsc_error_wq scsi: ufs: host: ufs-exynos: Make fsd_ufs_drvs static scsi: megaraid_sas: Remove unnecessary kfree() scsi: megaraid_sas: Fix double kfree() scsi: ufs: core: Enable link lost interrupt scsi: core: Allow the ALUA transitioning state enough time scsi: qla2xxx: Disable ATIO interrupt coalesce for quad port ISP27XX
2022-08-26wait_on_bit: add an acquire memory barrierMikulas Patocka
There are several places in the kernel where wait_on_bit is not followed by a memory barrier (for example, in drivers/md/dm-bufio.c:new_read). On architectures with weak memory ordering, it may happen that memory accesses that follow wait_on_bit are reordered before wait_on_bit and they may return invalid data. Fix this class of bugs by introducing a new function "test_bit_acquire" that works like test_bit, but has acquire memory ordering semantics. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Acked-by: Will Deacon <will@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-26lsm,io_uring: add LSM hooks for the new uring_cmd file opLuis Chamberlain
io-uring cmd support was added through ee692a21e9bf ("fs,io_uring: add infrastructure for uring-cmd"), this extended the struct file_operations to allow a new command which each subsystem can use to enable command passthrough. Add an LSM specific for the command passthrough which enables LSMs to inspect the command details. This was discussed long ago without no clear pointer for something conclusive, so this enables LSMs to at least reject this new file operation. [0] https://lkml.kernel.org/r/8adf55db-7bab-f59d-d612-ed906b948d19@schaufler-ca.com Cc: stable@vger.kernel.org Fixes: ee692a21e9bf ("fs,io_uring: add infrastructure for uring-cmd") Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-08-26Merge branch 'i2c/make_remove_callback_void-immutable' of ↵Linus Walleij
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux into devel This branch is needed to make the i2c driver remove() callback in new driver compile properly. Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2022-08-26drm/i915: Add new ADL-S pci idJosé Roberto de Souza
New PCI id recently added. BSpec: 53655 Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220824133935.51560-1-jose.souza@intel.com
2022-08-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel borkmann says: ==================== The following pull-request contains BPF updates for your *net* tree. We've added 11 non-merge commits during the last 14 day(s) which contain a total of 13 files changed, 61 insertions(+), 24 deletions(-). The main changes are: 1) Fix BPF verifier's precision tracking around BPF ring buffer, from Kumar Kartikeya Dwivedi. 2) Fix regression in tunnel key infra when passing FLOWI_FLAG_ANYSRC, from Eyal Birger. 3) Fix insufficient permissions for bpf_sys_bpf() helper, from YiFei Zhu. 4) Fix splat from hitting BUG when purging effective cgroup programs, from Pu Lehui. 5) Fix range tracking for array poke descriptors, from Daniel Borkmann. 6) Fix corrupted packets for XDP_SHARED_UMEM in aligned mode, from Magnus Karlsson. 7) Fix NULL pointer splat in BPF sockmap sk_msg_recvmsg(), from Liu Jian. 8) Add READ_ONCE() to bpf_jit_limit when reading from sysctl, from Kuniyuki Iwashima. 9) Add BPF selftest lru_bug check to s390x deny list, from Daniel Müller. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-26net: sched: remove unnecessary init of qdisc skb headZhengchao Shao
The memory allocated by using kzallloc_node and kcalloc has been cleared. Therefore, the structure members of the new qdisc are 0. So there's no need to explicitly assign a value of 0. Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-26Merge tag 'wireless-next-2022-08-26-v2' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Johannes berg says: ==================== Various updates: * rtw88: operation, locking, warning, and code style fixes * rtw89: small updates * cfg80211/mac80211: more EHT/MLO (802.11be, WiFi 7) work * brcmfmac: a couple of fixes * misc cleanups etc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-26dt-bindings: clock: Add AST2500/AST2600 HACE reset definitionNeal Liu
Add HACE reset bit definition for AST2500/AST2600. Signed-off-by: Neal Liu <neal_liu@aspeedtech.com> Signed-off-by: Johnny Huang <johnny_huang@aspeedtech.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-08-26platform/x86: asus-wmi: Implement TUF laptop keyboard power statesLuke D. Jones
Adds support for setting various power states of TUF keyboards. These states are combinations of: - boot, set if a boot animation is shown on keyboard - awake, set if the keyboard LEDs are visible while laptop is on - sleep, set if an animation is displayed while the laptop is suspended - keyboard (unknown effect) Adds two sysfs attributes to asus::kbd_backlight: - kbd_rgb_state - kbd_rgb_state_index Signed-off-by: Luke D. Jones <luke@ljones.dev> Link: https://lore.kernel.org/r/20220825232251.345893-3-luke@ljones.dev Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-08-26platform/x86: asus-wmi: Implement TUF laptop keyboard LED modesLuke D. Jones
Adds support for changing the laptop keyboard LED mode and colour. The modes are visible effects such as static, rainbow, pulsing, colour cycles. These sysfs attributes are added to asus::kbd_backlight: - kbd_rgb_mode - kbd_rgb_mode_index Signed-off-by: Luke D. Jones <luke@ljones.dev> Link: https://lore.kernel.org/r/20220825232251.345893-2-luke@ljones.dev Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-08-26platform/x86: asus-wmi: Support the GPU fan on TUF laptopsLuke D. Jones
Add support for TUF laptops which have the ability to control the GPU fan. This will show as a second fan in hwmon, and has the ability to run as boost (fullspeed), or auto. Signed-off-by: Luke D. Jones <luke@ljones.dev> Link: https://lore.kernel.org/r/20220826004210.356534-3-luke@ljones.dev Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-08-25bpf: Add CGROUP prefix to cgroup_iter_orderHao Luo
bpf_cgroup_iter_order is globally visible but the entries do not have CGROUP prefix. As requested by Andrii, put a CGROUP in the names in bpf_cgroup_iter_order. This patch fixes two previous commits: one introduced the API and the other uses the API in bpf selftest (that is, the selftest cgroup_hierarchical_stats). I tested this patch via the following command: test_progs -t cgroup,iter,btf_dump Fixes: d4ccaf58a847 ("bpf: Introduce cgroup iter") Fixes: 88886309d2e8 ("selftests/bpf: add a selftest for cgroup hierarchical stats collection") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/r/20220825223936.1865810-1-haoluo@google.com Signed-off-by: Martin KaFai Lau <kafai@fb.com>
2022-08-25Bluetooth: convert hci_update_adv_data to hci_syncBrian Gix
hci_update_adv_data() is called from hci_event and hci_core due to events from the controller. The prior function used the deprecated hci_request method, and the new one uses hci_sync.c Signed-off-by: Brian Gix <brian.gix@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-08-25Bluetooth: move hci_get_random_address() to hci_syncBrian Gix
This function has no dependencies on the deprecated hci_request mechanism, so has been moved unchanged to hci_sync.c Signed-off-by: Brian Gix <brian.gix@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-08-25Bluetooth: Move Adv Instance timer to hci_syncBrian Gix
The Advertising Instance expiration timer adv_instance_expire was handled with the deprecated hci_request mechanism, rather than it's replacement: hci_sync. Signed-off-by: Brian Gix <brian.gix@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2022-08-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/ethernet/mellanox/mlx5/core/en_fs.c 21234e3a84c7 ("net/mlx5e: Fix use after free in mlx5e_fs_init()") c7eafc5ed068 ("net/mlx5e: Convert ethtool_steering member of flow_steering struct to pointer") https://lore.kernel.org/all/20220825104410.67d4709c@canb.auug.org.au/ https://lore.kernel.org/all/20220823055533.334471-1-saeed@kernel.org/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-26ata: libata: Rename ATA_DFLAG_NCQ_PRIO_ENABLEDamien Le Moal
Rename ATA_DFLAG_NCQ_PRIO_ENABLE to ATA_DFLAG_NCQ_PRIO_ENABLED to match the fact that this flags indicates if NCQ priority use is enabled by the user. Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
2022-08-25netdev: Use try_cmpxchg in napi_if_scheduled_mark_missedUros Bizjak
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in napi_if_scheduled_mark_missed. x86 CMPXCHG instruction returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg fails, enabling further code simplifications. Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20220822143243.2798-1-ubizjak@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-25Merge tag 'net-6.0-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from ipsec and netfilter (with one broken Fixes tag). Current release - new code bugs: - dsa: don't dereference NULL extack in dsa_slave_changeupper() - dpaa: fix <1G ethernet on LS1046ARDB - neigh: don't call kfree_skb() under spin_lock_irqsave() Previous releases - regressions: - r8152: fix the RX FIFO settings when suspending - dsa: microchip: keep compatibility with device tree blobs with no phy-mode - Revert "net: macsec: update SCI upon MAC address change." - Revert "xfrm: update SA curlft.use_time", comply with RFC 2367 Previous releases - always broken: - netfilter: conntrack: work around exceeded TCP receive window - ipsec: fix a null pointer dereference of dst->dev on a metadata dst in xfrm_lookup_with_ifid - moxa: get rid of asymmetry in DMA mapping/unmapping - dsa: microchip: make learning configurable and keep it off while standalone - ice: xsk: prohibit usage of non-balanced queue id - rxrpc: fix locking in rxrpc's sendmsg Misc: - another chunk of sysctl data race silencing" * tag 'net-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits) net: lantiq_xrx200: restore buffer if memory allocation failed net: lantiq_xrx200: fix lock under memory pressure net: lantiq_xrx200: confirm skb is allocated before using net: stmmac: work around sporadic tx issue on link-up ionic: VF initial random MAC address if no assigned mac ionic: fix up issues with handling EAGAIN on FW cmds ionic: clear broken state on generation change rxrpc: Fix locking in rxrpc's sendmsg net: ethernet: mtk_eth_soc: fix hw hash reporting for MTK_NETSYS_V2 MAINTAINERS: rectify file entry in BONDING DRIVER i40e: Fix incorrect address type for IPv6 flow rules ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter net: Fix a data-race around sysctl_somaxconn. net: Fix a data-race around netdev_unregister_timeout_secs. net: Fix a data-race around gro_normal_batch. net: Fix data-races around sysctl_devconf_inherit_init_net. net: Fix data-races around sysctl_fb_tunnels_only_for_init_net. net: Fix a data-race around netdev_budget_usecs. net: Fix data-races around sysctl_max_skb_frags. net: Fix a data-race around netdev_budget. ...
2022-08-25net: devlink: limit flash component name to match version returned by info_get()Jiri Pirko
Limit the acceptance of component name passed to cmd_flash_update() to match one of the versions returned by info_get(), marked by version type. This makes things clearer and enforces 1:1 mapping between exposed version and accepted flash component. Check VERSION_TYPE_COMPONENT version type during cmd_flash_update() execution by calling info_get() with different "req" context. That causes info_get() to lookup the component name instead of filling-up the netlink message. Remove "UPDATE_COMPONENT" flag which becomes used. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-25net: devlink: extend info_get() version put to indicate a flash componentJiri Pirko
Whenever the driver is called by his info_get() op, it may put multiple version names and values to the netlink message. Extend by additional helper devlink_info_version_running/stored_put_ext() that allows to specify a version type that indicates when particular version name represents a flash component. This is going to be used in follow-up patch calling info_get() during flash update command checking if version with this the version type exists. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-25of/device: Fix up of_dma_configure_id() stubThierry Reding
Since the stub version of of_dma_configure_id() was added in commit a081bd4af4ce ("of/device: Add input id to of_dma_configure()"), it has not matched the signature of the full function, leading to build failure reports when code using this function is built on !OF configurations. Fixes: a081bd4af4ce ("of/device: Add input id to of_dma_configure()") Cc: stable@vger.kernel.org Signed-off-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Frank Rowand <frank.rowand@sony.com> Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Link: https://lore.kernel.org/r/20220824153256.1437483-1-thierry.reding@gmail.com Signed-off-by: Rob Herring <robh@kernel.org>
2022-08-25bpf: Introduce cgroup iterHao Luo
Cgroup_iter is a type of bpf_iter. It walks over cgroups in four modes: - walking a cgroup's descendants in pre-order. - walking a cgroup's descendants in post-order. - walking a cgroup's ancestors. - process only the given cgroup. When attaching cgroup_iter, one can set a cgroup to the iter_link created from attaching. This cgroup is passed as a file descriptor or cgroup id and serves as the starting point of the walk. If no cgroup is specified, the starting point will be the root cgroup v2. For walking descendants, one can specify the order: either pre-order or post-order. For walking ancestors, the walk starts at the specified cgroup and ends at the root. One can also terminate the walk early by returning 1 from the iter program. Note that because walking cgroup hierarchy holds cgroup_mutex, the iter program is called with cgroup_mutex held. Currently only one session is supported, which means, depending on the volume of data bpf program intends to send to user space, the number of cgroups that can be walked is limited. For example, given the current buffer size is 8 * PAGE_SIZE, if the program sends 64B data for each cgroup, assuming PAGE_SIZE is 4kb, the total number of cgroups that can be walked is 512. This is a limitation of cgroup_iter. If the output data is larger than the kernel buffer size, after all data in the kernel buffer is consumed by user space, the subsequent read() syscall will signal EOPNOTSUPP. In order to work around, the user may have to update their program to reduce the volume of data sent to output. For example, skip some uninteresting cgroups. In future, we may extend bpf_iter flags to allow customizing buffer size. Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/r/20220824233117.1312810-2-haoluo@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>