summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2024-01-07SUNRPC: discard sv_refcnt, and svc_get/svc_putNeilBrown
sv_refcnt is no longer useful. lockd and nfs-cb only ever have the svc active when there are a non-zero number of threads, so sv_refcnt mirrors sv_nrthreads. nfsd also keeps the svc active between when a socket is added and when the first thread is started, but we don't really need a refcount for that. We can simply not destroy the svc while there are any permanent sockets attached. So remove sv_refcnt and the get/put functions. Instead of a final call to svc_put(), call svc_destroy() instead. This is changed to also store NULL in the passed-in pointer to make it easier to avoid use-after-free situations. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svc: don't hold reference for poolstats, only mutex.NeilBrown
A future patch will remove refcounting on svc_serv as it is of little use. It is currently used to keep the svc around while the pool_stats file is open. Change this to get the pointer, protected by the mutex, only in seq_start, and the release the mutex in seq_stop. This means that if the nfsd server is stopped and restarted while the pool_stats file it open, then some pool stats info could be from the first instance and some from the second. This might appear odd, but is unlikely to be a problem in practice. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: Implement multi-stage Read completion againChuck Lever
Having an nfsd thread waiting for an RDMA Read completion is problematic if the Read responder (ie, the client) stops responding. We need to go back to handling RDMA Reads by getting the svc scheduler to call svc_rdma_recvfrom() a second time to finish building an RPC message after a Read completion. This is the final patch, and makes several changes that have to happen concurrently: 1. svc_rdma_process_read_list no longer waits for a completion, but simply builds and posts the Read WRs. 2. svc_rdma_read_done() now queues a completed Read on sc_read_complete_q for later processing rather than calling complete(). 3. The completed RPC message is no longer built in the svc_rdma_process_read_list() path. Finishing the message is now done in svc_rdma_recvfrom() when it notices work on the sc_read_complete_q. The "finish building this RPC message" code is removed from the svc_rdma_process_read_list() path. This arrangement avoids the need for an nfsd thread to wait for an RDMA Read non-interruptibly without a timeout. It's basically the same code structure that Tom Tucker used for Read chunks along with some clean-up and modernization. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: Copy construction of svc_rqst::rq_arg to rdma_read_complete()Chuck Lever
Once a set of RDMA Reads are complete, the Read completion handler will poke the transport to trigger a second call to svc_rdma_recvfrom(). recvfrom() will then merge the RDMA Read payloads with the previously received RPC header to form a completed RPC Call message. The new code is copied from the svc_rdma_process_read_list() path. A subsequent patch will make use of this code and remove the code that this was copied from (svc_rdma_rw.c). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: Add back svcxprt_rdma::sc_read_complete_qChuck Lever
Having an nfsd thread waiting for an RDMA Read completion is problematic if the Read responder (ie, the client) stops responding. We need to go back to handling RDMA Reads by allowing the nfsd thread to return to the svc scheduler, then waking a second thread finish the RPC message once the Read completion fires. As a next step, add a list_head upon which completed Reads are queued. A subsequent patch will make use of this queue. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: Add back svc_rdma_recv_ctxt::rc_pagesChuck Lever
Having an nfsd thread waiting for an RDMA Read completion is problematic if the Read responder (the client) stops responding. We need to go back to handling RDMA Reads by allowing the nfsd thread to return to the svc scheduler, then waking a second thread finish the RPC message once the Read completion fires. To start with, restore the rc_pages field so that RDMA Read pages can be managed across calls to svc_rdma_recvfrom(). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: De-duplicate completion ID initialization helpersChuck Lever
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: Move the svc_rdma_cc_init() callChuck Lever
Now that the chunk_ctxt for Reads is no longer dynamically allocated it can be initialized once for the life of the object that contains it (struct svc_rdma_recv_ctxt). Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: Update synopsis of svc_rdma_build_read_segment()Chuck Lever
Since the RDMA Read I/O state is now contained in the recv_ctxt, svc_rdma_build_read_segment() can use the recv_ctxt to derive that information rather than the other way around. This removes one usage of the ri_readctxt field, enabling its removal in a subsequent patch. At the same time, the use of ri_rqst can similarly be replaced with a passed-in function parameter. Start with build_read_segment() because it is a common utility function at the bottom of the Read chunk path. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: Move read_info::ri_pageoff into struct svc_rdma_recv_ctxtChuck Lever
Further clean up: move the starting byte offset field into svc_rdma_recv_ctxt. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: Move svc_rdma_read_info::ri_pageno to struct svc_rdma_recv_ctxtChuck Lever
Further clean up: move the page index field into svc_rdma_recv_ctxt. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: Start moving fields out of struct svc_rdma_read_infoChuck Lever
Since the request's svc_rdma_recv_ctxt will stay around for the duration of the RDMA Read operation, the contents of struct svc_rdma_read_info can reside in the request's svc_rdma_recv_ctxt rather than being allocated separately. This will eventually save a call to kmalloc() in a hot path. Start this clean-up by moving the Read chunk's svc_rdma_chunk_ctxt. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: Move struct svc_rdma_chunk_ctxt to svc_rdma.hChuck Lever
Prepare for nestling these into the send and recv ctxts so they no longer have to be allocated dynamically. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: Update some svcrdma DMA-related tracepointsChuck Lever
A send/recv_ctxt already records transport-related information in the cq.id, thus there is no need to record the IP addresses of the transport endpoints. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: DMA error tracepoints should report completion IDsChuck Lever
Update the DMA error flow tracepoints to report the completion ID of the failing context. This ties the wait/failure to a particular operation or request, which is more useful than knowing only the failing transport. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: SQ error tracepoints should report completion IDsChuck Lever
Update the Send Queue's error flow tracepoints to report the completion ID of the waiting or failing context. This ties the wait/failure to a particular operation or request, which is a little more useful than knowing only the transport that is about to close. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07rpcrdma: Introduce a simple cid tracepoint classChuck Lever
De-duplicate some code, making it easier to add new tracepoints that report only a completion ID. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: Add an async version of svc_rdma_send_ctxt_put()Chuck Lever
DMA unmapping can take quite some time, so it should not be handled in a single-threaded completion handler. Defer releasing send_ctxts to the recently-added workqueue. With this patch, DMA unmapping can be handled in parallel, and it does not cause head-of-queue blocking of Send completions. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: Add a utility workqueue to svcrdmaChuck Lever
To handle work in the background, set up an UNBOUND workqueue for svcrdma. Subsequent patches will make use of it. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svcrdma: Eliminate allocation of recv_ctxt objects in backchannelChuck Lever
The svc_rdma_recv_ctxt free list uses a lockless list to avoid the need for a spin lock in the fast path. llist_del_first(), which is used by svc_rdma_recv_ctxt_get(), requires serialization, however, when there are multiple list producers that are unserialized. I mistakenly thought there was only one caller of svc_rdma_recv_ctxt_get() (svc_rdma_refresh_recvs()), thus explicit serialization would not be necessary. But there is another caller: svc_rdma_bc_sendto(), and these two are not serialized against each other. I haven't seen ill effects that I could directly ascribe to a lack of serialization. It's just an observation based on code audit. When DMA-mapping before sending a Reply, the passed-in struct svc_rdma_recv_ctxt is used only for its write and reply PCLs. These are currently always empty in the backchannel case. So, instead of passing a full svc_rdma_recv_ctxt object to svc_rdma_map_reply_msg(), let's pass in just the Write and Reply PCLs. This change makes it unnecessary for the backchannel to acquire a dummy svc_rdma_recv_ctxt object when sending an RPC Call. The need for svc_rdma_recv_ctxt free list serialization is now completely avoided. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07NFSv4, NFSD: move enum nfs_cb_opnum4 to include/linux/nfs4.hChenXiaoSong
Callback operations enum is defined in client and server, move it to common header file. Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Anna Schumaker <Anna.Schumaker@netapp.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07SUNRPC: Remove RQ_SPLICE_OKChuck Lever
This flag is no longer used. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07SUNRPC: Add a server-side API for retrieving an RPC's pseudoflavorChuck Lever
NFSD will use this new API to determine whether nfsd_splice_read is safe to use. This avoids the need to add a dependency to NFSD for CONFIG_SUNRPC_GSS. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07net: stmmac: Make MSI interrupt routine genericSwee Leong Ching
There is no support for per DMA channel interrupt for non-MSI platform, where the MAC's per channel interrupt hooks up to interrupt controller(GIC) through shared peripheral interrupt(SPI) to handle interrupt from TX/RX transmit channel. This patch generalize the existing MSI ISR to also support non-MSI platform. Signed-off-by: Teoh Ji Sheng <ji.sheng.teoh@intel.com> Signed-off-by: Swee Leong Ching <leong.ching.swee@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-07net/sched: simplify tc_action_load_ops parametersPedro Tammela
Instead of using two bools derived from a flags passed as arguments to the parent function of tc_action_load_ops, just pass the flags itself to tc_action_load_ops to simplify its parameters. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Pedro Tammela <pctammela@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-07RDMA/efa: Add EFA query MR supportMichael Margolin
Add EFA driver uapi definitions and register a new query MR method that currently returns the physical interconnects the device is using to reach the MR. Update admin definitions and efa-abi accordingly. Reviewed-by: Anas Mousa <anasmous@amazon.com> Reviewed-by: Firas Jahjah <firasj@amazon.com> Signed-off-by: Michael Margolin <mrgolin@amazon.com> Link: https://lore.kernel.org/r/20240104095155.10676-1-mrgolin@amazon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-01-05Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2024-01-05 We've added 40 non-merge commits during the last 2 day(s) which contain a total of 73 files changed, 1526 insertions(+), 951 deletions(-). The main changes are: 1) Fix a memory leak when streaming AF_UNIX sockets were inserted into multiple sockmap slots/maps, from John Fastabend. 2) Fix gotol in s390 BPF JIT with large offsets, from Ilya Leoshkevich. 3) Fix reattachment branch in bpf_tracing_prog_attach() and reject the request if there is no valid attach_btf, from Jiri Olsa. 4) Remove deprecated bpfilter kernel leftovers given the project is developed in user space (https://github.com/facebook/bpfilter), from Quentin Deslandes. 5) Relax tracing BPF program recursive attach rules given right now it is not possible to create tracing program call cycles, from Dmitrii Dolgov. 6) Fix excessive memory consumption for the bpf_global_percpu_ma for systems with a large number of CPUs, from Yonghong Song. 7) Small x86 BPF JIT cleanup to reuse emit_nops instead of open-coding memcpy of x86_nops, from Leon Hwang. 8) Follow-up for libbpf to support __arg_ctx global function argument tag semantics to complement the merged kernel side, from Andrii Nakryiko. 9) Introduce "volatile compare" macros for BPF selftests in order to make the latter more robust against compiler optimization, from Alexei Starovoitov. 10) Small simplification in verifier's size checking of helper accesses along with additional selftests, from Andrei Matei. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (40 commits) selftests/bpf: Test re-attachment fix for bpf_tracing_prog_attach bpf: Fix re-attachment branch in bpf_tracing_prog_attach selftests/bpf: Add test for recursive attachment of tracing progs bpf: Relax tracing prog recursive attach rules bpf, x86: Use emit_nops to replace memcpy x86_nops selftests/bpf: Test gotol with large offsets selftests/bpf: Double the size of test_loader log s390/bpf: Fix gotol with large offsets bpfilter: remove bpfilter bpf: Remove unnecessary cpu == 0 check in memalloc selftests/bpf: add __arg_ctx BTF rewrite test selftests/bpf: add arg:ctx cases to test_global_funcs tests libbpf: implement __arg_ctx fallback logic libbpf: move BTF loading step after relocation step libbpf: move exception callbacks assignment logic into relocation step libbpf: use stable map placeholder FDs libbpf: don't rely on map->fd as an indicator of map being created libbpf: use explicit map reuse flag to skip map creation steps libbpf: make uniform use of btf__fd() accessor inside libbpf selftests/bpf: Add a selftest with > 512-byte percpu allocation size ... ==================== Link: https://lore.kernel.org/r/20240105170105.21070-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-05Merge branch 'for-6.8/cxl-misc' into for-6.8/cxlDan Williams
Pick up some miscellaneous fixups for v6.8.
2024-01-05cxl/events: Promote CXL event structures to a core headerIra Weiny
UEFI code can process CXL events through CPER records. Those records use almost the same format as the CXL events. Lift the CXL event structures to a core header to be shared in later patches. [jic123: drop "CXL rev 3.0" mention] Signed-off-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-2-1bb8a4ca2c7a@intel.com [djbw: add F: entry to maintainers for include/linux/cxl-event.h] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-01-05asm-generic: Fix 32 bit __generic_cmpxchg_localDavid McKay
Commit 656e9007ef58 ("asm-generic: avoid __generic_cmpxchg_local warnings") introduced a typo that means the code is incorrect for 32 bit values. It will work fine for postive numbers, but will fail for negative numbers on a system where longs are 64 bit. Fixes: 656e9007ef58 ("asm-generic: avoid __generic_cmpxchg_local warnings") Signed-off-by: David McKay <david.mckay@codasip.com> Signed-off-by: Stuart Menefy <stuart.menefy@codasip.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-01-05mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCINGLi Zhijian
Demotion can work well without CONFIG_NUMA_BALANCING. But the commit 23e9f0138963 ("mm/vmstat: move pgdemote_* to per-node stats") wrongly hid it behind CONFIG_NUMA_BALANCING. Fix it by moving them out of CONFIG_NUMA_BALANCING. Link: https://lkml.kernel.org/r/20231229022651.3229174-1-lizhijian@fujitsu.com Fixes: 23e9f0138963 ("mm/vmstat: move pgdemote_* to per-node stats") Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Cc: "Huang, Ying" <ying.huang@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-05mm/memcontrol: remove __mod_lruvec_page_state()Matthew Wilcox (Oracle)
There are no more callers of __mod_lruvec_page_state(), so convert the implementation to __lruvec_stat_mod_folio(), removing two calls to compound_head() (one explicit, one hidden inside page_memcg()). Link: https://lkml.kernel.org/r/20231228085748.1083901-7-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Zi Yan <ziy@nvidia.com> Acked-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-05mm: remove inc/dec lruvec page state functionsMatthew Wilcox (Oracle)
Patch series "Remove some lruvec page accounting functions", v2. Some functions are now unused; remove them. Make __mod_lruvec_page_state() unused and then remove it. This patch (of 6): All callers of these have been converted to their folio equivalents. Link: https://lkml.kernel.org/r/20231228085748.1083901-1-willy@infradead.org Link: https://lkml.kernel.org/r/20231228085748.1083901-2-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-05mm/mglru: add dummy pmd_dirty()Kinsey Ho
Add dummy pmd_dirty() for architectures that don't provide it. This is similar to commit 6617da8fb565 ("mm: add dummy pmd_young() for architectures not having it"). Link: https://lkml.kernel.org/r/20231227141205.2200125-5-kinseyho@google.com Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312210606.1Etqz3M4-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202312210042.xQEiqlEh-lkp@intel.com/ Signed-off-by: Kinsey Ho <kinseyho@google.com> Suggested-by: Yu Zhao <yuzhao@google.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Donet Tom <donettom@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-05mm/mglru: remove CONFIG_MEMCGKinsey Ho
Remove CONFIG_MEMCG in a refactoring to improve code readability at the cost of a few bytes in struct lru_gen_folio per node when CONFIG_MEMCG=n. Link: https://lkml.kernel.org/r/20231227141205.2200125-4-kinseyho@google.com Signed-off-by: Kinsey Ho <kinseyho@google.com> Co-developed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Tested-by: Donet Tom <donettom@linux.vnet.ibm.com> Acked-by: Yu Zhao <yuzhao@google.com> Cc: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-05mm/mglru: add CONFIG_LRU_GEN_WALKS_MMUKinsey Ho
Add CONFIG_LRU_GEN_WALKS_MMU such that if disabled, the code that walks page tables to promote pages into the youngest generation will not be built. Also improves code readability by adding two helper functions get_mm_state() and get_next_mm(). Link: https://lkml.kernel.org/r/20231227141205.2200125-3-kinseyho@google.com Signed-off-by: Kinsey Ho <kinseyho@google.com> Co-developed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Tested-by: Donet Tom <donettom@linux.vnet.ibm.com> Acked-by: Yu Zhao <yuzhao@google.com> Cc: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-05mm/mglru: add CONFIG_ARCH_HAS_HW_PTE_YOUNGKinsey Ho
Patch series "mm/mglru: Kconfig cleanup", v4. This series is the result of the following discussion: https://lore.kernel.org/47066176-bd93-55dd-c2fa-002299d9e034@linux.ibm.com/ It mainly avoids building the code that walks page tables on CPUs that use it, i.e., those don't support hardware accessed bit. Specifically, it introduces a new Kconfig to guard some of functions added by commit bd74fdaea146 ("mm: multi-gen LRU: support page table walks") on CPUs like POWER9, on which the series was tested. This patch (of 5): Some architectures are able to set the accessed bit in PTEs when PTEs are used as part of linear address translations. Add CONFIG_ARCH_HAS_HW_PTE_YOUNG for such architectures to be able to override arch_has_hw_pte_young(). Link: https://lkml.kernel.org/r/20231227141205.2200125-1-kinseyho@google.com Link: https://lkml.kernel.org/r/20231227141205.2200125-2-kinseyho@google.com Signed-off-by: Kinsey Ho <kinseyho@google.com> Co-developed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Tested-by: Donet Tom <donettom@linux.vnet.ibm.com> Acked-by: Yu Zhao <yuzhao@google.com> Cc: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-05mm/rmap: silence VM_WARN_ON_FOLIO() in __folio_rmap_sanity_checks()David Hildenbrand
Unfortunately, vm_insert_page() and friends and up passing driver-allocated folios into folio_add_file_rmap_pte() using insert_page_into_pte_locked(). While these driver-allocated folios can be compound pages (large folios), they are not proper "rmappable" folios. In these VM_MIXEDMAP VMAs, there isn't really the concept of a reverse mapping, so long-term, we should clean that up and not call into rmap code. For the time being, document how we can end up in rmap code with large folios that are not marked rmappable. Link: https://lkml.kernel.org/r/793c5cee-d5fc-4eb1-86a2-39e05686233d@redhat.com Fixes: 68f0320824fa ("mm/rmap: convert folio_add_file_rmap_range() into folio_add_file_rmap_[pte|ptes|pmd]()") Reported-by: syzbot+50ef73537bbc393a25bb@syzkaller.appspotmail.com Closes: https://lkml.kernel.org/r/000000000000014174060e09316e@google.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Yin Fengwei <fengwei.yin@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-01-05dpll: expose fractional frequency offset value to userJiri Pirko
Add a new netlink attribute to expose fractional frequency offset value for a pin. Add an op to get the value from the driver. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Acked-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Acked-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Link: https://lore.kernel.org/r/20240103132838.1501801-2-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-05netfs: Fix interaction between write-streaming and cachefiles cullingDavid Howells
An issue can occur between write-streaming (storing dirty data in partial non-uptodate pages) and a cachefiles object being culled to make space. The problem occurs because the cache object is only marked in use while there are files open using it. Once it has been released, it can be culled and the cookie marked disabled. At this point, a streaming write is permitted to occur (if the cache is active, we require pages to be prefetched and cached), but the cache can become active again before this gets flushed out - and then two effects can occur: (1) The cache may be asked to write out a region that's less than its DIO block size (assumed by cachefiles to be PAGE_SIZE) - and this causes one of two debugging statements to be emitted. (2) netfs_how_to_modify() gets confused because it sees a page that isn't allowed to be non-uptodate being uptodate and tries to prefetch it - leading to a warning that PG_fscache is set twice. Fix this by the following means: (1) Add a netfs_inode flag to disallow write-streaming to an inode and set it if we ever do local caching of that inode. It remains set for the lifetime of that inode - even if the cookie becomes disabled. (2) If the no-write-streaming flag is set, then make netfs_how_to_modify() always want to prefetch instead. (3) If netfs_how_to_modify() decides it wants to prefetch a folio, but that folio has write-streamed data in it, then it requires the folio be flushed first. (4) Export a counter of the number of times we wanted to prefetch a non-uptodate page, but found it had write-streamed data in it. (5) Export a counter of the number of times we cancelled a write to the cache because it didn't DIO align and remove the debug statements. Reported-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-erofs@lists.ozlabs.org cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2024-01-05ASoC: tas2781: Add tas2563 into header file for DSP modeShenghao Ding
Move tas2563 from tas2562 header file to tas2781 header file to unbind tas2563 from tas2562 driver code and bind it to tas2781 driver code, because tas2563 only work in bypass-DSP mode with tas2562 driver. In order to enable DSP mode for tas2563, it has been moved to tas2781 driver. As to the hardware part, such as register setting and DSP firmware, all these are stored in the binary firmware. What tas2781 drivder does is to parse the firmware and download it to the chip, then power on the chip. So, tas2781 driver can be resued as tas2563 driver. Only attention will be paid to downloading corresponding firmware. Signed-off-by: Shenghao Ding <shenghao-ding@ti.com> Link: https://msgid.link/r/20240104145721.1398-3-shenghao-ding@ti.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-01-05pwm: linux/pwm.h: fix Excess kernel-doc description warningRandy Dunlap
Remove the @pwm: line to prevent the kernel-doc warning: include/linux/pwm.h:87: warning: Excess struct member 'pwm' description in 'pwm_device' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Cc: <linux-pwm@vger.kernel.org> Fixes: f3e25e68ceb2 ("pwm: Drop unused member "pwm" from struct pwm_device") Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2024-01-05pwm: Add pwm_apply_state() compatibility stubThierry Reding
In order to make the transition to the new pwm_apply_might_sleep() a bit smoother, add a compatibility stub. This will prevent new calls to the old function introduced via other subsystems from breaking builds. Once the next merge window has closed we can take another stab at removing the stub. Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
2024-01-04jbd2: remove unused 'JBD2_CHECKPOINT_IO_ERROR' and 'j_atomic_flags'Zhihao Cheng
Since 'JBD2_CHECKPOINT_IO_ERROR' and j_atomic_flags' are not useful anymore after fs dev's errseq is imported into jbd2, just remove them. Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20231213013224.2100050-4-chengzhihao1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-01-04jbd2: add errseq to detect client fs's bdev writeback errorZhihao Cheng
Add errseq in journal, so that JBD2 can detect whether metadata is successfully written to fs bdev. This patch adds detection in recovery process to replace original solution(using local variable wb_err). Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20231213013224.2100050-2-chengzhihao1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-01-04bpf: Relax tracing prog recursive attach rulesDmitrii Dolgov
Currently, it's not allowed to attach an fentry/fexit prog to another one fentry/fexit. At the same time it's not uncommon to see a tracing program with lots of logic in use, and the attachment limitation prevents usage of fentry/fexit for performance analysis (e.g. with "bpftool prog profile" command) in this case. An example could be falcosecurity libs project that uses tp_btf tracing programs. Following the corresponding discussion [1], the reason for that is to avoid tracing progs call cycles without introducing more complex solutions. But currently it seems impossible to load and attach tracing programs in a way that will form such a cycle. The limitation is coming from the fact that attach_prog_fd is specified at the prog load (thus making it impossible to attach to a program loaded after it in this way), as well as tracing progs not implementing link_detach. Replace "no same type" requirement with verification that no more than one level of attachment nesting is allowed. In this way only one fentry/fexit program could be attached to another fentry/fexit to cover profiling use case, and still no cycle could be formed. To implement, add a new field into bpf_prog_aux to track nested attachment for tracing programs. [1]: https://lore.kernel.org/bpf/20191108064039.2041889-16-ast@kernel.org/ Acked-by: Jiri Olsa <olsajiri@gmail.com> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Link: https://lore.kernel.org/r/20240103190559.14750-2-9erthalion6@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt.c e009b2efb7a8 ("bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()") 0f2b21477988 ("bnxt_en: Fix compile error without CONFIG_RFS_ACCEL") https://lore.kernel.org/all/20240105115509.225aa8a2@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-04Merge tag 'wireless-next-2024-01-03' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Johannes Berg says: ==================== Just a couple of more things over the holidays: - first kunit tests for both cfg80211 and mac80211 - a few multi-link fixes - DSCP mapping update - RCU fix * tag 'wireless-next-2024-01-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: wifi: mac80211: remove redundant ML element check wifi: cfg80211: parse all ML elements in an ML probe response wifi: cfg80211: correct comment about MLD ID wifi: cfg80211: Update the default DSCP-to-UP mapping wifi: cfg80211: tests: add some scanning related tests wifi: mac80211: kunit: extend MFP tests wifi: mac80211: kunit: generalize public action test wifi: mac80211: add kunit tests for public action handling kunit: add a convenience allocation wrapper for SKBs kunit: add parameter generation macro using description from array wifi: mac80211: fix spelling typo in comment wifi: cfg80211: fix RCU dereference in __cfg80211_bss_update ==================== Link: https://lore.kernel.org/r/20240103144423.52269-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-04Merge tag 'net-6.7-rc9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireless and netfilter. We haven't accumulated much over the break. If it wasn't for the uninterrupted stream of fixes for Intel drivers this PR would be very slim. There was a handful of user reports, however, either they stood out because of the lower traffic or users have had more time to test over the break. The ones which are v6.7-relevant should be wrapped up. Current release - regressions: - Revert "net: ipv6/addrconf: clamp preferred_lft to the minimum required", it caused issues on networks where routers send prefixes with preferred_lft=0 - wifi: - iwlwifi: pcie: don't synchronize IRQs from IRQ, prevent deadlock - mac80211: fix re-adding debugfs entries during reconfiguration Current release - new code bugs: - tcp: print AO/MD5 messages only if there are any keys Previous releases - regressions: - virtio_net: fix missing dma unmap for resize, prevent OOM Previous releases - always broken: - mptcp: prevent tcp diag from closing listener subflows - nf_tables: - set transport header offset for egress hook, fix IPv4 mangling - skip set commit for deleted/destroyed sets, avoid double deactivation - nat: make sure action is set for all ct states, fix openvswitch matching on ICMP packets in related state - eth: mlxbf_gige: fix receive hang under heavy traffic - eth: r8169: fix PCI error on system resume for RTL8168FP - net: add missing getsockopt(SO_TIMESTAMPING_NEW) and cmsg handling" * tag 'net-6.7-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits) net/tcp: Only produce AO/MD5 logs if there are any keys net: Implement missing SO_TIMESTAMPING_NEW cmsg support bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters() net: ravb: Wait for operating mode to be applied asix: Add check for usbnet_get_endpoints octeontx2-af: Re-enable MAC TX in otx2_stop processing octeontx2-af: Always configure NIX TX link credits based on max frame size net/smc: fix invalid link access in dumping SMC-R connections net/qla3xxx: fix potential memleak in ql_alloc_buffer_queues virtio_net: fix missing dma unmap for resize igc: Fix hicredit calculation ice: fix Get link status data length i40e: Restore VF MSI-X state during PCI reset i40e: fix use-after-free in i40e_aqc_add_filters() net: Save and restore msg_namelen in sock_sendmsg netfilter: nft_immediate: drop chain reference counter on error netfilter: nf_nat: fix action not being set for all ct states net: bcmgenet: Fix FCS generation for fragmented skbuffs mptcp: prevent tcp diag from closing listener subflows MAINTAINERS: add Geliang as reviewer for MPTCP ...
2024-01-05Merge tag 'drm-misc-next-fixes-2024-01-04' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-next One fix for drm/plane to avoid a use-after-free and some additional warnings to prevent more of these occurences, a lock inversion dependency fix and an indentation fix for drm/rockchip, and some doc warning fixes for imagination and gpuvm. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/enhl33v2oeihktta2yfyc4exvezdvm3eexcuwxkethc5ommrjo@lkidkv2kwakq