summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-12-23io_uring: use region api for SQPavel Begunkov
Convert internal parts of the SQ managment to the region API. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1fb73ced6b835cb319ab0fe1dc0b2e982a9a5650.1732886067.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-23io_uring: pass ctx to io_register_free_ringsPavel Begunkov
A preparation patch, pass the context to io_register_free_rings. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c1865fd2b3d4db22d1a1aac7dd06ea22cb990834.1732886067.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-23io_uring/memmap: implement mmap for regionsPavel Begunkov
The patch implements mmap for the param region and enables the kernel allocation mode. Internally it uses a fixed mmap offset, however the user has to use the offset returned in struct io_uring_region_desc::mmap_offset. Note, mmap doesn't and can't take ->uring_lock and the region / ring lookup is protected by ->mmap_lock, and it's directly peeking at ctx->param_region. We can't protect io_create_region() with the mmap_lock as it'd deadlock, which is why io_create_region_mmap_safe() initialises it for us in a temporary variable and then publishes it with the lock taken. It's intentionally decoupled from main region helpers, and in the future we might want to have a list of active regions, which then could be protected by the ->mmap_lock. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0f1212bd6af7fb39b63514b34fae8948014221d1.1732886067.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-23io_uring/memmap: implement kernel allocated regionsPavel Begunkov
Allow the kernel to allocate memory for a region. That's the classical way SQ/CQ are allocated. It's not yet useful to user space as there is no way to mmap it, which is why it's explicitly disabled in io_register_mem_region(). Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7b8c40e6542546bbf93f4842a9a42a7373b81e0d.1732886067.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-23io_uring/memmap: add IO_REGION_F_SINGLE_REFPavel Begunkov
Kernel allocated compound pages will have just one reference for the entire page array, add a flag telling io_free_region about that. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a7abfa7535e9728d5fcade29a1ea1605ec2c04ce.1732886067.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-23io_uring/memmap: helper for pinning region pagesPavel Begunkov
In preparation to adding kernel allocated regions extract a new helper that pins user pages. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/a17d7c39c3de4266b66b75b2dcf768150e1fc618.1732886067.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-23io_uring/memmap: optimise single folio regionsPavel Begunkov
We don't need to vmap if memory is already physically contiguous. There are two important cases it covers: PAGE_SIZE regions and huge pages. Use io_check_coalesce_buffer() to get the number of contiguous folios. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/d5240af23064a824c29d14d2406f1ae764bf4505.1732886067.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-23io_uring/memmap: reuse io_free_region for failure pathPavel Begunkov
Regions are going to become more complex with allocation options and optimisations, I want to split initialisation into steps and for that it needs a sane fail path. Reuse io_free_region(), it's smart enough to undo only what's needed and leaves the structure in a consistent state. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/b853b4ec407cc80d033d021bdd2c14e22378fc78.1732886067.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-23io_uring/memmap: account memory before pinningPavel Begunkov
Move memory accounting before page pinning. It shouldn't even try to pin pages if it's not allowed, and accounting is also relatively inexpensive. It also give a better code structure as we do generic accounting and then can branch for different mapping types. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/1e242b8038411a222e8b269d35e021fa5015289f.1732886067.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-23io_uring/memmap: flag regions with user pagesPavel Begunkov
In preparation to kernel allocated regions add a flag telling if the region contains user pinned pages or not. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/0dc91564642654405bab080b7ec911cb4a43ec6e.1732886067.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-23io_uring/memmap: flag vmap'ed regionsPavel Begunkov
Add internal flags for struct io_mapped_region. The first flag we need is IO_REGION_F_VMAPPED, that indicates that the pointer has to be unmapped on region destruction. For now all regions are vmap'ed, so it's set unconditionally. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/5a3d8046a038da97c0f8a8c8f1733fa3fc689d31.1732886067.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-23io_uring/rsrc: export io_check_coalesce_bufferPavel Begunkov
io_try_coalesce_buffer() is a useful helper collecting useful info about a set of pages, I want to reuse it for analysing ring/etc. mappings. I don't need the entire thing and only interested if it can be coalesced into a single page, but that's better than duplicating the parsing. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/353b447953cd5d34c454a7d909bb6024c391d6e2.1732886067.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-23io_uring: rename ->resize_lockPavel Begunkov
->resize_lock is used for resizing rings, but it's a good idea to reuse it in other cases as well. Rename it into mmap_lock as it's protects from races with mmap. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/68f705306f3ac4d2fb999eb80ea1615015ce9f7f.1732886067.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-24tracing/kprobe: Make trace_kprobe's module callback called after jump_label ↵Masami Hiramatsu (Google)
update Make sure the trace_kprobe's module notifer callback function is called after jump_label's callback is called. Since the trace_kprobe's callback eventually checks jump_label address during registering new kprobe on the loading module, jump_label must be updated before this registration happens. Link: https://lore.kernel.org/all/173387585556.995044.3157941002975446119.stgit@devnote2/ Fixes: 614243181050 ("tracing/kprobes: Support module init function probing") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-12-23RDMA/hns: Fix missing flush CQE for DWQEChengchang Tang
Flush CQE handler has not been called if QP state gets into errored mode in DWQE path. So, the new added outstanding WQEs will never be flushed. It leads to a hung task timeout when using NFS over RDMA: __switch_to+0x7c/0xd0 __schedule+0x350/0x750 schedule+0x50/0xf0 schedule_timeout+0x2c8/0x340 wait_for_common+0xf4/0x2b0 wait_for_completion+0x20/0x40 __ib_drain_sq+0x140/0x1d0 [ib_core] ib_drain_sq+0x98/0xb0 [ib_core] rpcrdma_xprt_disconnect+0x68/0x270 [rpcrdma] xprt_rdma_close+0x20/0x60 [rpcrdma] xprt_autoclose+0x64/0x1cc [sunrpc] process_one_work+0x1d8/0x4e0 worker_thread+0x154/0x420 kthread+0x108/0x150 ret_from_fork+0x10/0x18 Fixes: 01584a5edcc4 ("RDMA/hns: Add support of direct wqe") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241220055249.146943-5-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-23RDMA/hns: Fix warning storm caused by invalid input in IO pathChengchang Tang
WARN_ON() is called in the IO path. And it could lead to a warning storm. Use WARN_ON_ONCE() instead of WARN_ON(). Fixes: 12542f1de179 ("RDMA/hns: Refactor process about opcode in post_send()") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241220055249.146943-4-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-23RDMA/hns: Fix accessing invalid dip_ctx during destroying QPChengchang Tang
If it fails to modify QP to RTR, dip_ctx will not be attached. And during detroying QP, the invalid dip_ctx pointer will be accessed. Fixes: faa62440a577 ("RDMA/hns: Fix different dgids mapping to the same dip_idx") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241220055249.146943-3-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-23RDMA/hns: Fix mapping error of zero-hop WQE bufferwenglianfa
Due to HW limitation, the three region of WQE buffer must be mapped and set to HW in a fixed order: SQ buffer, SGE buffer, and RQ buffer. Currently when one region is zero-hop while the other two are not, the zero-hop region will not be mapped. This violate the limitation above and leads to address error. Fixes: 38389eaa4db1 ("RDMA/hns: Add mtr support for mixed multihop addressing") Signed-off-by: wenglianfa <wenglianfa@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241220055249.146943-2-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-23Revert "drm/mediatek: dsi: Correct calculation formula of PHY Timing"Chun-Kuang Hu
This reverts commit 417d8c47271d5cf1a705e997065873b2a9a36fd4. With that patch the panel in the Tentacruel ASUS Chromebook CM14 (CM1402F) flickers. There are 1 or 2 times per second a black panel. Stable Kernel 6.11.5 and mainline 6.12-rc4 works only when reverse that patch. Fixes: 417d8c47271d ("drm/mediatek: dsi: Correct calculation formula of PHY Timing") Cc: stable@vger.kernel.org Cc: Shuijing Li <shuijing.li@mediatek.com> Reported-by: Jens Ziller <zillerbaer@gmx.de> Closes: https://patchwork.kernel.org/project/dri-devel/patch/20240412031208.30688-1-shuijing.li@mediatek.com/ Link: https://patchwork.kernel.org/project/dri-devel/patch/20241212001908.6056-1-chunkuang.hu@kernel.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2024-12-23cifs: Remove unused is_server_using_iface()Dr. David Alan Gilbert
The last use of is_server_using_iface() was removed in 2022 by commit aa45dadd34e4 ("cifs: change iface_list from array to sorted linked list") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-23smb: enable reuse of deferred file handles for write operationsBharath SM
Previously, deferred file handles were reused only for read operations, this commit extends to reusing deferred handles for write operations. By reusing these handles we can reduce the need for open/close operations over the wire. Signed-off-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-12-23Merge back earlier cpufreq material for 6.14Rafael J. Wysocki
2024-12-23udp: Deal with race between UDP socket address change and rehashStefano Brivio
If a UDP socket changes its local address while it's receiving datagrams, as a result of connect(), there is a period during which a lookup operation might fail to find it, after the address is changed but before the secondary hash (port and address) and the four-tuple hash (local and remote ports and addresses) are updated. Secondary hash chains were introduced by commit 30fff9231fad ("udp: bind() optimisation") and, as a result, a rehash operation became needed to make a bound socket reachable again after a connect(). This operation was introduced by commit 719f835853a9 ("udp: add rehash on connect()") which isn't however a complete fix: the socket will be found once the rehashing completes, but not while it's pending. This is noticeable with a socat(1) server in UDP4-LISTEN mode, and a client sending datagrams to it. After the server receives the first datagram (cf. _xioopen_ipdgram_listen()), it issues a connect() to the address of the sender, in order to set up a directed flow. Now, if the client, running on a different CPU thread, happens to send a (subsequent) datagram while the server's socket changes its address, but is not rehashed yet, this will result in a failed lookup and a port unreachable error delivered to the client, as apparent from the following reproducer: LEN=$(($(cat /proc/sys/net/core/wmem_default) / 4)) dd if=/dev/urandom bs=1 count=${LEN} of=tmp.in while :; do taskset -c 1 socat UDP4-LISTEN:1337,null-eof OPEN:tmp.out,create,trunc & sleep 0.1 || sleep 1 taskset -c 2 socat OPEN:tmp.in UDP4:localhost:1337,shut-null wait done where the client will eventually get ECONNREFUSED on a write() (typically the second or third one of a given iteration): 2024/11/13 21:28:23 socat[46901] E write(6, 0x556db2e3c000, 8192): Connection refused This issue was first observed as a seldom failure in Podman's tests checking UDP functionality while using pasta(1) to connect the container's network namespace, which leads us to a reproducer with the lookup error resulting in an ICMP packet on a tap device: LOCAL_ADDR="$(ip -j -4 addr show|jq -rM '.[] | .addr_info[0] | select(.scope == "global").local')" while :; do ./pasta --config-net -p pasta.pcap -u 1337 socat UDP4-LISTEN:1337,null-eof OPEN:tmp.out,create,trunc & sleep 0.2 || sleep 1 socat OPEN:tmp.in UDP4:${LOCAL_ADDR}:1337,shut-null wait cmp tmp.in tmp.out done Once this fails: tmp.in tmp.out differ: char 8193, line 29 we can finally have a look at what's going on: $ tshark -r pasta.pcap 1 0.000000 :: ? ff02::16 ICMPv6 110 Multicast Listener Report Message v2 2 0.168690 88.198.0.161 ? 88.198.0.164 UDP 8234 60260 ? 1337 Len=8192 3 0.168767 88.198.0.161 ? 88.198.0.164 UDP 8234 60260 ? 1337 Len=8192 4 0.168806 88.198.0.161 ? 88.198.0.164 UDP 8234 60260 ? 1337 Len=8192 5 0.168827 c6:47:05:8d:dc:04 ? Broadcast ARP 42 Who has 88.198.0.161? Tell 88.198.0.164 6 0.168851 9a:55:9a:55:9a:55 ? c6:47:05:8d:dc:04 ARP 42 88.198.0.161 is at 9a:55:9a:55:9a:55 7 0.168875 88.198.0.161 ? 88.198.0.164 UDP 8234 60260 ? 1337 Len=8192 8 0.168896 88.198.0.164 ? 88.198.0.161 ICMP 590 Destination unreachable (Port unreachable) 9 0.168926 88.198.0.161 ? 88.198.0.164 UDP 8234 60260 ? 1337 Len=8192 10 0.168959 88.198.0.161 ? 88.198.0.164 UDP 8234 60260 ? 1337 Len=8192 11 0.168989 88.198.0.161 ? 88.198.0.164 UDP 4138 60260 ? 1337 Len=4096 12 0.169010 88.198.0.161 ? 88.198.0.164 UDP 42 60260 ? 1337 Len=0 On the third datagram received, the network namespace of the container initiates an ARP lookup to deliver the ICMP message. In another variant of this reproducer, starting the client with: strace -f pasta --config-net -u 1337 socat UDP4-LISTEN:1337,null-eof OPEN:tmp.out,create,trunc 2>strace.log & and connecting to the socat server using a loopback address: socat OPEN:tmp.in UDP4:localhost:1337,shut-null we can more clearly observe a sendmmsg() call failing after the first datagram is delivered: [pid 278012] connect(173, 0x7fff96c95fc0, 16) = 0 [...] [pid 278012] recvmmsg(173, 0x7fff96c96020, 1024, MSG_DONTWAIT, NULL) = -1 EAGAIN (Resource temporarily unavailable) [pid 278012] sendmmsg(173, 0x561c5ad0a720, 1, MSG_NOSIGNAL) = 1 [...] [pid 278012] sendmmsg(173, 0x561c5ad0a720, 1, MSG_NOSIGNAL) = -1 ECONNREFUSED (Connection refused) and, somewhat confusingly, after a connect() on the same socket succeeded. Until commit 4cdeeee9252a ("net: udp: prefer listeners bound to an address"), the race between receive address change and lookup didn't actually cause visible issues, because, once the lookup based on the secondary hash chain failed, we would still attempt a lookup based on the primary hash (destination port only), and find the socket with the outdated secondary hash. That change, however, dropped port-only lookups altogether, as side effect, making the race visible. To fix this, while avoiding the need to make address changes and rehash atomic against lookups, reintroduce primary hash lookups as fallback, if lookups based on four-tuple and secondary hashes fail. To this end, introduce a simplified lookup implementation, which doesn't take care of SO_REUSEPORT groups: if we have one, there are multiple sockets that would match the four-tuple or secondary hash, meaning that we can't run into this race at all. v2: - instead of synchronising lookup operations against address change plus rehash, reintroduce a simplified version of the original primary hash lookup as fallback v1: - fix build with CONFIG_IPV6=n: add ifdef around sk_v6_rcv_saddr usage (Kuniyuki Iwashima) - directly use sk_rcv_saddr for IPv4 receive addresses instead of fetching inet_rcv_saddr (Kuniyuki Iwashima) - move inet_update_saddr() to inet_hashtables.h and use that to set IPv4/IPv6 addresses as suitable (Kuniyuki Iwashima) - rebase onto net-next, update commit message accordingly Reported-by: Ed Santiago <santiago@redhat.com> Link: https://github.com/containers/podman/issues/24147 Analysed-by: David Gibson <david@gibson.dropbear.id.au> Fixes: 30fff9231fad ("udp: bind() optimisation") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-23can: dev: can_get_state_str(): Remove dead codeAriel Otilibili
The default switch case ends with a return; meaning this return is never reached. Coverity-ID: 1497123 Signed-off-by: Ariel Otilibili <ariel.otilibili-anieli@eurecom.fr> Link: https://patch.msgid.link/20241221111454.1074285-4-ariel.otilibili-anieli@eurecom.fr Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-12-23MAINTAINERS: assign em_canid.c additionally to CAN maintainersOliver Hartkopp
The extended match rule em_canid is used to classify CAN frames based on their CAN Identifier. To keep the CAN maintainers in the loop for relevant changes which might affect the CAN specific functionality add em_canid.c to the CAN NETWORK LAYER files. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://patch.msgid.link/20241219190837.3087-1-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-12-23mailmap: add an entry for Oliver HartkoppOliver Hartkopp
Map my retired company address and an accidentally used personal mail address within mailmap. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Link: https://patch.msgid.link/20241130170911.2828-1-socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-12-23dt-bindings: net: can: atmel: Convert to json schemaCharan Pedumuru
Convert old text based binding to json schema. Changes during conversion: - Add a fallback for `microchip,sam9x60-can` as it is compatible with the CAN IP core on `atmel,at91sam9x5-can`. - Add the required properties `clock` and `clock-names`, which were missing in the original binding. - Update examples and include appropriate file directives to resolve errors identified by `dt_binding_check` and `dtbs_check`. Signed-off-by: Charan Pedumuru <charan.pedumuru@microchip.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/20241120-can-v3-1-da5bb4f6128d@microchip.com [mkl: fixed indention in example] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-12-23can: tcan4x5x: get rid of false clock errorsSean Nyekjaer
tcan4x5x devices only requires the clock "cclk", so call devm_clk_get() directly. This is done to avoid m_can_class_get_clocks() that checks for both hclk and cclk and results in this warning message: | tcan4x5x spi0.0: no clock found Signed-off-by: Sean Nyekjaer <sean@geanix.com> Link: https://patch.msgid.link/20241128-mcancclk-v1-1-a93aac64dbae@geanix.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-12-23can: sun4i_can: continue to use likely() to check skbDario Binacchi
Throughout the sun4i_can_err() function, the likely() macro is used to check the skb buffer, except in one instance. This patch makes the code consistent by using the macro in that case as well. Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Link: https://patch.msgid.link/20241122221650.633981-4-dario.binacchi@amarulasolutions.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-12-23Merge patch series "can: tcan4x5x: add option for selecting nWKRQ voltage"Marc Kleine-Budde
Sean Nyekjaer <sean@geanix.com> says: This series adds support for setting the nWKRQ voltage. Link: https://patch.msgid.link/20241114-tcan-wkrqv-v5-0-a2d50833ed71@geanix.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-12-23can: tcan4x5x: add option for selecting nWKRQ voltageSean Nyekjaer
The nWKRQ pin supports an output voltage of either the internal reference voltage (3.6V) or the reference voltage of the digital interface 0-6V (VIO). Add the devicetree option ti,nwkrq-voltage-vio to set it to VIO. If this property is omitted the reset default, the internal reference voltage, is used. Signed-off-by: Sean Nyekjaer <sean@geanix.com> Reviewed-by: Marc Kleine-Budde <mkl@pengutronix.de> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://patch.msgid.link/20241114-tcan-wkrqv-v5-2-a2d50833ed71@geanix.com [mkl: remove unused variable in tcan4x5x_get_dt_data()] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-12-23OPP: fix dev_pm_opp_find_bw_*() when bandwidth table not initializedNeil Armstrong
If a driver calls dev_pm_opp_find_bw_ceil/floor() the retrieve bandwidth from the OPP table but the bandwidth table was not created because the interconnect properties were missing in the OPP consumer node, the kernel will crash with: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000004 ... pc : _read_bw+0x8/0x10 lr : _opp_table_find_key+0x9c/0x174 ... Call trace: _read_bw+0x8/0x10 (P) _opp_table_find_key+0x9c/0x174 (L) _find_key+0x98/0x168 dev_pm_opp_find_bw_ceil+0x50/0x88 ... In order to fix the crash, create an assert function to check if the bandwidth table was created before trying to get a bandwidth with _read_bw(). Fixes: add1dc094a74 ("OPP: Use generic key finding helpers for bandwidth key") Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23OPP: add index check to assert to avoid buffer overflow in _read_freq()Neil Armstrong
Pass the freq index to the assert function to make sure we do not read a freq out of the opp->rates[] table when called from the indexed variants: dev_pm_opp_find_freq_exact_indexed() or dev_pm_opp_find_freq_ceil/floor_indexed(). Add a secondary parameter to the assert function, unused for assert_single_clk() then add assert_clk_index() which will check for the clock index when called from the _indexed() find functions. Fixes: 142e17c1c2b4 ("OPP: Introduce dev_pm_opp_find_freq_{ceil/floor}_indexed() APIs") Fixes: a5893928bb17 ("OPP: Add dev_pm_opp_find_freq_exact_indexed()") Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23opp: core: Fix off by one in dev_pm_opp_get_bw()Dan Carpenter
The "opp->bandwidth" array has "opp->opp_table->path_count" number of elements. It's allocated in _opp_allocate(). So this > needs to be >= to prevent an out of bounds access. Fixes: d78653dcd8bf ("opp: core: implement dev_pm_opp_get_bw") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23opp: core: implement dev_pm_opp_get_bwNeil Armstrong
Add and implement dev_pm_opp_get_bw() to retrieve the OPP's bandwidth in the same way as the dev_pm_opp_get_voltage() helper. Retrieving bandwidth is required in the case of the Adreno GPU where the GPU Management Unit can handle the Bandwidth scaling. The helper can get the peak or average bandwidth for any of the interconnect path. Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> [ Viresh: Fixed commit log and a comment in code ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23cpufreq: qcom: Implement clk_ops::determine_rate() for qcom_cpufreq* clocksManivannan Sadhasivam
determine_rate() callback is used by the clk_set_rate() API to get the closest rate of the target rate supported by the clock. If this callback is not implemented (nor round_rate() callback), then the API will assume that the clock cannot set the requested rate. And since there is no parent, it will return -EINVAL. This is not an issue right now as clk_set_rate() mistakenly compares the target rate with cached rate and bails out early. But once that is fixed to compare the target rate with the actual rate of the clock (returned by recalc_rate()), then clk_set_rate() for this clock will start to fail as below: cpu cpu0: _opp_config_clk_single: failed to set clock rate: -22 So implement the determine_rate() callback that just returns the actual rate at which the clock is passed to the CPUs in a domain. Fixes: 4370232c727b ("cpufreq: qcom-hw: Add CPU clock provider support") Reported-by: Johan Hovold <johan+linaro@kernel.org> Suggested-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23cpufreq: qcom: Fix qcom_cpufreq_hw_recalc_rate() to query LUT if LMh IRQ is ↵Manivannan Sadhasivam
not available Currently, qcom_cpufreq_hw_recalc_rate() returns the LMh throttled frequency for the domain even if LMh IRQ is not available. But as per qcom_cpufreq_hw_get(), the driver has to query LUT entries to get the actual frequency of the domain. So do the same in qcom_cpufreq_hw_recalc_rate(). While doing so, refactor the existing qcom_cpufreq_hw_get() function so that qcom_cpufreq_hw_recalc_rate() can make use of the existing code and avoid code duplication. This also requires setting the qcom_cpufreq_data::policy even if LMh IRQ is not available. Fixes: 4370232c727b ("cpufreq: qcom-hw: Add CPU clock provider support") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23cpufreq: apple-soc: Add Apple A7-A8X SoC cpufreq supportNick Chan
These SoCs only use 3 bits for p-states, and have a different APPLE_DVFS_CMD_PS1 mask value. Signed-off-by: Nick Chan <towinchenmi@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23cpufreq: apple-soc: Set fallback transition latency to ↵Nick Chan
APPLE_DVFS_TRANSITION_TIMEOUT The driver already assumes transitions will not take longer than APPLE_DVFS_TRANSITION_TIMEOUT in apple_soc_cpufreq_set_target(), so it makes little sense to set CPUFREQ_ETERNAL as the transition latency when the transistion latency is not given by the opp-table. Reviewed-by: Christian Loehle <christian.loehle@arm.com> Signed-off-by: Nick Chan <towinchenmi@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23cpufreq: apple-soc: Increase cluster switch timeout to 400usNick Chan
Apple A11 SoC takes a long time to switch. Maximum switch time observed is 345us, so increase the cluster switch timeout to 400us to be safe. Signed-off-by: Nick Chan <towinchenmi@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23cpufreq: apple-soc: Use 32-bit read for status registerNick Chan
Apple A7-A9(X) SoCs requires 32-bit reads on the status register. Newer SoCs accepts 32-bit reads on the status register as well. Signed-off-by: Nick Chan <towinchenmi@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23cpufreq: apple-soc: Allow per-SoC configuration of APPLE_DVFS_CMD_PS1Nick Chan
Support for SoC that has a different APPLE_DVFS_CMD_PS1 will be added soon, so modify the driver first to allow it to be configured per-SoC. Signed-off-by: Nick Chan <towinchenmi@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23cpufreq: apple-soc: Drop setting the PS2 field on M2+Hector Martin
Newer device do not use this. It is not known what this field does, but change the behavior to be same as macOS to be safe. Signed-off-by: Hector Martin <marcan@marcan.st> Signed-off-by: Nick Chan <towinchenmi@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23dt-bindings: cpufreq: apple,cluster-cpufreq: Add A7-A11, T2 compatiblesNick Chan
Add compatibles for Apple A7-A11, T2 SoCs. Apple A7, A8, A8X gets the per-SoC compatible and the A7 "apple,s5l8960x-cluster-cpufreq" compatible. Apple A9, A9X, A10, A10X, T2, A11 gets the per-SoC compatible, M1 "apple,t8103-cluster-cpufreq" compatible, then the "apple,cluster-cpufreq" fallback compatible. Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Nick Chan <towinchenmi@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23dt-bindings: cpufreq: Document support for Airoha EN7581 CPUFreqChristian Marangi
On newer Airoha SoC, CPU Frequency is scaled indirectly with SMC commands to ATF. A virtual clock is exposed. This virtual clock is a get-only clock and is used to expose the current global CPU clock. The frequency info comes by the output of the SMC command that reports the clock in MHz. The SMC sets the CPU clock by providing an index, this is modelled as performance states in a power domain. CPUs can't be individually scaled as the CPU frequency is shared across all CPUs and is global. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23cpufreq: fix using cpufreq-dt as moduleAndreas Kemnade
This driver can be built as a module since commit 3b062a086984 ("cpufreq: dt-platdev: Support building as module"), but unfortunately this caused a regression because the cputfreq-dt-platdev.ko module does not autoload. Usually, this is solved by just using the MODULE_DEVICE_TABLE() macro to export all the device IDs as module aliases. But this driver is special due how matches with devices and decides what platform supports. There are two of_device_id lists, an allow list that are for CPU devices that always match and a deny list that's for devices that must not match. The driver registers a cpufreq-dt platform device for all the CPU device nodes that either are in the allow list or contain an operating-points-v2 property and are not in the deny list. Enforce builtin compile of cpufreq-dt-platdev to make autoload work. Fixes: 3b062a086984 ("cpufreq: dt-platdev: Support building as module") Link: https://lore.kernel.org/all/20241104201424.2a42efdd@akair/ Link: https://lore.kernel.org/all/20241119111918.1732531-1-javierm@redhat.com/ Cc: stable@vger.kernel.org Signed-off-by: Andreas Kemnade <andreas@kemnade.info> Reported-by: Radu Rendec <rrendec@redhat.com> Reported-by: Javier Martinez Canillas <javierm@redhat.com> [ Viresh: Picked commit log from Javier, updated tags ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23cpufreq: scmi: Register for limit change notificationsSibi Sankar
Register for limit change notifications if supported and use the throttled frequency from the notification to apply HW pressure. Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com> Tested-by: Mike Tipton <quic_mdtipton@quicinc.com> Reviewed-by: Cristian Marussi <cristian.marussi@arm.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2024-12-23leds: Add LED1202 I2C driverVicentiu Galanopulo
The output current can be adjusted separately for each channel by 8-bit analog (current sink input) and 12-bit digital (PWM) dimming control. The LED1202 implements 12 low-side current generators with independent dimming control. Internal volatile memory allows the user to store up to 8 different patterns, each pattern is a particular output configuration in terms of PWM duty-cycle (on 4096 steps). Analog dimming (on 256 steps) is per channel but common to all patterns. Each device tree LED node will have a corresponding entry in /sys/class/leds with the label name. The brightness property corresponds to the per channel analog dimming, while the patterns[1-8] to the PWM dimming control. Signed-off-by: Vicentiu Galanopulo <vicentiu.galanopulo@remote-tech.co.uk> Link: https://lore.kernel.org/r/20241218182001.41476-4-vicentiu.galanopulo@remote-tech.co.uk Signed-off-by: Lee Jones <lee@kernel.org>
2024-12-23dt-bindings: leds: Add LED1202 LED ControllerVicentiu Galanopulo
The LED1202 is a 12-channel low quiescent current LED driver with: * Supply range from 2.6 V to 5 V * 20 mA current capability per channel * 1.8 V compatible I2C control interface * 8-bit analog dimming individual control * 12-bit local PWM resolution * 8 programmable patterns If the led node is present in the controller then the channel is set to active. Signed-off-by: Vicentiu Galanopulo <vicentiu.galanopulo@remote-tech.co.uk> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20241218182001.41476-3-vicentiu.galanopulo@remote-tech.co.uk Signed-off-by: Lee Jones <lee@kernel.org>
2024-12-23Documentation:leds: Add leds-st1202.rstVicentiu Galanopulo
Add usage for sysfs hw_pattern entry for leds-st1202 Signed-off-by: Vicentiu Galanopulo <vicentiu.galanopulo@remote-tech.co.uk> Link: https://lore.kernel.org/r/20241218182001.41476-2-vicentiu.galanopulo@remote-tech.co.uk Signed-off-by: Lee Jones <lee@kernel.org>