summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2024-08-23net: sfp: pass the phy_device when disconnecting an sfp module's PHYMaxime Chevallier
Pass the phy_device as a parameter to the sfp upstream .disconnect_phy operation. This is preparatory work to help track phy devices across a net_device's link. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-23net: phy: Introduce ethernet link topology representationMaxime Chevallier
Link topologies containing multiple network PHYs attached to the same net_device can be found when using a PHY as a media converter for use with an SFP connector, on which an SFP transceiver containing a PHY can be used. With the current model, the transceiver's PHY can't be used for operations such as cable testing, timestamping, macsec offload, etc. The reason being that most of the logic for these configuration, coming from either ethtool netlink or ioctls tend to use netdev->phydev, which in multi-phy systems will reference the PHY closest to the MAC. Introduce a numbering scheme allowing to enumerate PHY devices that belong to any netdev, which can in turn allow userspace to take more precise decisions with regard to each PHY's configuration. The numbering is maintained per-netdev, in a phy_device_list. The numbering works similarly to a netdevice's ifindex, with identifiers that are only recycled once INT_MAX has been reached. This prevents races that could occur between PHY listing and SFP transceiver removal/insertion. The identifiers are assigned at phy_attach time, as the numbering depends on the netdevice the phy is attached to. The PHY index can be re-used for PHYs that are persistent. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-23filemap: allocate mapping_min_order folios in the page cachePankaj Raghav
filemap_create_folio() and do_read_cache_folio() were always allocating folio of order 0. __filemap_get_folio was trying to allocate higher order folios when fgp_flags had higher order hint set but it will default to order 0 folio if higher order memory allocation fails. Supporting mapping_min_order implies that we guarantee each folio in the page cache has at least an order of mapping_min_order. When adding new folios to the page cache we must also ensure the index used is aligned to the mapping_min_order as the page cache requires the index to be aligned to the order of the folio. Co-developed-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> Link: https://lore.kernel.org/r/20240822135018.1931258-3-kernel@pankajraghav.com Tested-by: David Howells <dhowells@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-23fs: Allow fine-grained control of folio sizesMatthew Wilcox (Oracle)
We need filesystems to be able to communicate acceptable folio sizes to the pagecache for a variety of uses (e.g. large block sizes). Support a range of folio sizes between order-0 and order-31. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Co-developed-by: Pankaj Raghav <p.raghav@samsung.com> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> Link: https://lore.kernel.org/r/20240822135018.1931258-2-kernel@pankajraghav.com Tested-by: David Howells <dhowells@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-23iommu: Handle iommu faults for a bad iopf setupPranjal Shrivastava
The iommu_report_device_fault function was updated to return void while assuming that drivers only need to call iommu_report_device_fault() for reporting an iopf. This implementation causes following problems: 1. The drivers rely on the core code to call it's page_reponse, however, when a fault is received and no fault capable domain is attached / iopf_param is NULL, the ops->page_response is NOT called causing the device to stall in case the fault type was PAGE_REQ. 2. The arm_smmu_v3 driver relies on the returned value to log errors returning void from iommu_report_device_fault causes these events to be missed while logging. Modify the iommu_report_device_fault function to return -EINVAL for cases where no fault capable domain is attached or iopf_param was NULL and calls back to the driver (ops->page_response) in case the fault type was IOMMU_FAULT_PAGE_REQ. The returned value can be used by the drivers to log the fault/event as needed. Reported-by: Kunkun Jiang <jiangkunkun@huawei.com> Closes: https://lore.kernel.org/all/6147caf0-b9a0-30ca-795e-a1aa502a5c51@huawei.com/ Fixes: 3dfa64aecbaf ("iommu: Make iommu_report_device_fault() return void") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Pranjal Shrivastava <praan@google.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20240816104906.1010626-1-praan@google.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-08-23dt-bindings: clock: add ExynosAuto v920 SoC CMU bindingsSunyeal Hong
Add dt-schema for ExynosAuto v920 SoC clock controller. Add device tree clock binding definitions for below CMU blocks. - CMU_TOP - CMU_PERIC0/1 - CMU_MISC - CMU_HSI0/1 Signed-off-by: Sunyeal Hong <sunyeal.hong@samsung.com> Link: https://lore.kernel.org/r/20240821232652.1077701-2-sunyeal.hong@samsung.com Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2024-08-23xfrm: Correct spelling in xfrm.hSimon Horman
Correct spelling in xfrm.h. As reported by codespell. Signed-off-by: Simon Horman <horms@kernel.org> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-08-22Merge patch series "Simplify multiple create*_workqueue() invocations"Martin K. Petersen
Bart Van Assche <bvanassche@acm.org> says: Hi Martin, Multiple SCSI drivers use snprintf() to format a workqueue name before invoking one of the create*_workqueue() macros. This patch series simplifies such code by passing the format string and arguments to alloc_workqueue(). Additionally, the structure members that are only used as a temporary buffer for formatting workqueue names are removed. Please consider this patch series for the next merge window. Thanks, Bart. Link: https://lore.kernel.org/r/20240822195944.654691-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22scsi: core: Simplify an alloc_workqueue() invocationBart Van Assche
Let alloc_workqueue() format the workqueue name. Remove the work_q_name[] member from struct Scsi_Host because it is no longer used by any SCSI driver nor by the SCSI core. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240822195944.654691-19-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22scsi: scsi_transport_fc: Simplify alloc_workqueue() invocationsBart Van Assche
Let alloc_workqueue() format the workqueue name instead of calling snprintf() explicitly. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240822195944.654691-16-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22scsi: fcoe: Simplify alloc_ordered_workqueue() invocationsBart Van Assche
Let alloc_ordered_workqueue() format the workqueue name instead of calling snprintf() explicitly. Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240822195944.654691-7-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22scsi: ufs: Move UFS trace events to private headerAvri Altman
UFS trace events are called exclusively from the UFS core drivers. Make those events private to the core driver. The MAINTAINERS file does not need updating as the maintainership remains the same and the relevant directory is already covered. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20240821055411.3128159-1-avri.altman@wdc.com Acked-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: drivers/net/ethernet/broadcom/bnxt/bnxt.h c948c0973df5 ("bnxt_en: Don't clear ntuple filters and rss contexts during ethtool ops") f2878cdeb754 ("bnxt_en: Add support to call FW to update a VNIC") Link: https://patch.msgid.link/20240822210125.1542769-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-22string: Check for "nonstring" attribute on strscpy() argumentsKees Cook
GCC already checks for arguments that are marked with the "nonstring"[1] attribute when used on standard C String API functions (e.g. strcpy). Gain this compile-time checking also for the kernel's primary string copying function, strscpy(). Note that Clang has neither "nonstring" nor __builtin_has_attribute(). Link: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-nonstring-variable-attribute [1] Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Tested-by: Miguel Ojeda <ojeda@kernel.org> Link: https://lore.kernel.org/r/20240805214340.work.339-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-22virt: vbox: struct vmmdev_hgcm_pagelist: Replace 1-element array with ↵Kees Cook
flexible array Replace the deprecated[1] use of a 1-element array in struct vmmdev_hgcm_pagelist with a modern flexible array. As this is UAPI, we cannot trivially change the size of the struct, so use a union to retain the old first element's size, but switch "pages" to a flexible array. No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20240710231555.work.406-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2024-08-23Merge tag 'net-6.11-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth and netfilter. Current release - regressions: - virtio_net: avoid crash on resume - move netdev_tx_reset_queue() call before RX napi enable Current release - new code bugs: - net/mlx5e: fix page leak and incorrect header release w/ HW GRO Previous releases - regressions: - udp: fix receiving fraglist GSO packets - tcp: prevent refcount underflow due to concurrent execution of tcp_sk_exit_batch() Previous releases - always broken: - ipv6: fix possible UAF when incrementing error counters on output - ip6: tunnel: prevent merging of packets with different L2 - mptcp: pm: fix IDs not being reusable - bonding: fix potential crashes in IPsec offload handling - Bluetooth: HCI: - MGMT: add error handling to pair_device() to avoid a crash - invert LE State quirk to be opt-out rather then opt-in - fix LE quote calculation - drv: dsa: VLAN fixes for Ocelot driver - drv: igb: cope with large MAX_SKB_FRAGS Kconfig settings - drv: ice: fi Rx data path on architectures with PAGE_SIZE >= 8192 Misc: - netpoll: do not export netpoll_poll_[disable|enable]() - MAINTAINERS: update the list of networking headers" * tag 'net-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits) s390/iucv: Fix vargs handling in iucv_alloc_device() net: ovs: fix ovs_drop_reasons error net: xilinx: axienet: Fix dangling multicast addresses net: xilinx: axienet: Always disable promiscuous mode MAINTAINERS: Mark JME Network Driver as Odd Fixes MAINTAINERS: Add header files to NETWORKING sections MAINTAINERS: Add limited globs for Networking headers MAINTAINERS: Add net_tstamp.h to SOCKET TIMESTAMPING section MAINTAINERS: Add sonet.h to ATM section of MAINTAINERS octeontx2-af: Fix CPT AF register offset calculation net: phy: realtek: Fix setting of PHY LEDs Mode B bit on RTL8211F net: ngbe: Fix phy mode set to external phy netfilter: flowtable: validate vlan header bnxt_en: Fix double DMA unmapping for XDP_REDIRECT ipv6: prevent possible UAF in ip6_xmit() ipv6: fix possible UAF in ip6_finish_output2() ipv6: prevent UAF in ip6_send_skb() netpoll: do not export netpoll_poll_[disable|enable]() selftests: mlxsw: ethtool_lanes: Source ethtool lib from correct path udp: fix receiving fraglist GSO packets ...
2024-08-22nvme-tcp: check for invalidated or revoked keyHannes Reinecke
key_lookup() will always return a key, even if that key is revoked or invalidated. So check for invalid keys before continuing. Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2024-08-22ASoC: grace time for DPCM cleanupMark Brown
Merge series from Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>: As we discussed in [1], we don't need to use dpcm_playback/capture flag, so we remove it. But we have been using it for 10 years, some driver might get damage. The most likely case is that the device/driver can use both playback/capture, but have only one flag, and not using xxx_only flag. [1/3] patch indicates warning in such case. These adds grace time for DPCM cleanup. I'm not sure when dpcm_xxx will be removed, and Codec check bypass will be error, but maybe v6.12 or v6.13 ? Please check each driver by that time. Previous patch-set try to check both CPU and Codec in DPCM, but we noticed that there are some special DAI which we can't handle today [2]. So I will escape it in this patch-set. [1] https://lore.kernel.org/r/87edaym2cg.wl-kuninori.morimoto.gx@renesas.com [2] https://lore.kernel.org/all/3e67d62d-fe08-4f55-ab5b-ece8a57154f9@linux.intel.com/ Link: https://lore.kernel.org/r/87edaym2cg.wl-kuninori.morimoto.gx@renesas.com Link: https://lore.kernel.org/r/87wmo6dyxg.wl-kuninori.morimoto.gx@renesas.com Link: https://lore.kernel.org/r/87msole5wc.wl-kuninori.morimoto.gx@renesas.com Link: https://lore.kernel.org/r/871q5tnuok.wl-kuninori.morimoto.gx@renesas.com Link: https://lore.kernel.org/r/87bk4oqerx.wl-kuninori.morimoto.gx@renesas.com Link: https://lore.kernel.org/r/8734pctmte.wl-kuninori.morimoto.gx@renesas.com Link: https://lore.kernel.org/r/87r0ctwzr4.wl-kuninori.morimoto.gx@renesas.com Link: https://lore.kernel.org/r/87cymvlmki.wl-kuninori.morimoto.gx@renesas.com
2024-08-22Input: keypad-nomadik-ske - remove the driverDmitry Torokhov
The users of this driver were removed in 2013 in commit 28633c54bda6 ("ARM: ux500: Rip out keypad initialisation which is no longer used"). Remove the driver as well. Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/Zr-gX0dfN4te_8VG@google.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2024-08-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfAlexei Starovoitov
Cross-merge bpf fixes after downstream PR including important fixes (from bpf-next point of view): commit 41c24102af7b ("selftests/bpf: Filter out _GNU_SOURCE when compiling test_cpp") commit fdad456cbcca ("bpf: Fix updating attached freplace prog in prog_array map") No conflicts. Adjacent changes in: include/linux/bpf_verifier.h kernel/bpf/verifier.c tools/testing/selftests/bpf/Makefile Link: https://lore.kernel.org/bpf/20240813234307.82773-1-alexei.starovoitov@gmail.com/ Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-08-22lsm: replace indirect LSM hook calls with static callsKP Singh
LSM hooks are currently invoked from a linked list as indirect calls which are invoked using retpolines as a mitigation for speculative attacks (Branch History / Target injection) and add extra overhead which is especially bad in kernel hot paths: security_file_ioctl: 0xff...0320 <+0>: endbr64 0xff...0324 <+4>: push %rbp 0xff...0325 <+5>: push %r15 0xff...0327 <+7>: push %r14 0xff...0329 <+9>: push %rbx 0xff...032a <+10>: mov %rdx,%rbx 0xff...032d <+13>: mov %esi,%ebp 0xff...032f <+15>: mov %rdi,%r14 0xff...0332 <+18>: mov $0xff...7030,%r15 0xff...0339 <+25>: mov (%r15),%r15 0xff...033c <+28>: test %r15,%r15 0xff...033f <+31>: je 0xff...0358 <security_file_ioctl+56> 0xff...0341 <+33>: mov 0x18(%r15),%r11 0xff...0345 <+37>: mov %r14,%rdi 0xff...0348 <+40>: mov %ebp,%esi 0xff...034a <+42>: mov %rbx,%rdx 0xff...034d <+45>: call 0xff...2e0 <__x86_indirect_thunk_array+352> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Indirect calls that use retpolines leading to overhead, not just due to extra instruction but also branch misses. 0xff...0352 <+50>: test %eax,%eax 0xff...0354 <+52>: je 0xff...0339 <security_file_ioctl+25> 0xff...0356 <+54>: jmp 0xff...035a <security_file_ioctl+58> 0xff...0358 <+56>: xor %eax,%eax 0xff...035a <+58>: pop %rbx 0xff...035b <+59>: pop %r14 0xff...035d <+61>: pop %r15 0xff...035f <+63>: pop %rbp 0xff...0360 <+64>: jmp 0xff...47c4 <__x86_return_thunk> The indirect calls are not really needed as one knows the addresses of enabled LSM callbacks at boot time and only the order can possibly change at boot time with the lsm= kernel command line parameter. An array of static calls is defined per LSM hook and the static calls are updated at boot time once the order has been determined. With the hook now exposed as a static call, one can see that the retpolines are no longer there and the LSM callbacks are invoked directly: security_file_ioctl: 0xff...0ca0 <+0>: endbr64 0xff...0ca4 <+4>: nopl 0x0(%rax,%rax,1) 0xff...0ca9 <+9>: push %rbp 0xff...0caa <+10>: push %r14 0xff...0cac <+12>: push %rbx 0xff...0cad <+13>: mov %rdx,%rbx 0xff...0cb0 <+16>: mov %esi,%ebp 0xff...0cb2 <+18>: mov %rdi,%r14 0xff...0cb5 <+21>: jmp 0xff...0cc7 <security_file_ioctl+39> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Static key enabled for SELinux 0xffffffff818f0cb7 <+23>: jmp 0xff...0cde <security_file_ioctl+62> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Static key enabled for BPF LSM. This is something that is changed to default to false to avoid the existing side effect issues of BPF LSM [1] in a subsequent patch. 0xff...0cb9 <+25>: xor %eax,%eax 0xff...0cbb <+27>: xchg %ax,%ax 0xff...0cbd <+29>: pop %rbx 0xff...0cbe <+30>: pop %r14 0xff...0cc0 <+32>: pop %rbp 0xff...0cc1 <+33>: cs jmp 0xff...0000 <__x86_return_thunk> 0xff...0cc7 <+39>: endbr64 0xff...0ccb <+43>: mov %r14,%rdi 0xff...0cce <+46>: mov %ebp,%esi 0xff...0cd0 <+48>: mov %rbx,%rdx 0xff...0cd3 <+51>: call 0xff...3230 <selinux_file_ioctl> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Direct call to SELinux. 0xff...0cd8 <+56>: test %eax,%eax 0xff...0cda <+58>: jne 0xff...0cbd <security_file_ioctl+29> 0xff...0cdc <+60>: jmp 0xff...0cb7 <security_file_ioctl+23> 0xff...0cde <+62>: endbr64 0xff...0ce2 <+66>: mov %r14,%rdi 0xff...0ce5 <+69>: mov %ebp,%esi 0xff...0ce7 <+71>: mov %rbx,%rdx 0xff...0cea <+74>: call 0xff...e220 <bpf_lsm_file_ioctl> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Direct call to BPF LSM. 0xff...0cef <+79>: test %eax,%eax 0xff...0cf1 <+81>: jne 0xff...0cbd <security_file_ioctl+29> 0xff...0cf3 <+83>: jmp 0xff...0cb9 <security_file_ioctl+25> 0xff...0cf5 <+85>: endbr64 0xff...0cf9 <+89>: mov %r14,%rdi 0xff...0cfc <+92>: mov %ebp,%esi 0xff...0cfe <+94>: mov %rbx,%rdx 0xff...0d01 <+97>: pop %rbx 0xff...0d02 <+98>: pop %r14 0xff...0d04 <+100>: pop %rbp 0xff...0d05 <+101>: ret 0xff...0d06 <+102>: int3 0xff...0d07 <+103>: int3 0xff...0d08 <+104>: int3 0xff...0d09 <+105>: int3 While this patch uses static_branch_unlikely indicating that an LSM hook is likely to be not present. In most cases this is still a better choice as even when an LSM with one hook is added, empty slots are created for all LSM hooks (especially when many LSMs that do not initialize most hooks are present on the system). There are some hooks that don't use the call_int_hook or call_void_hook. These hooks are updated to use a new macro called lsm_for_each_hook where the lsm_callback is directly invoked as an indirect call. Below are results of the relevant Unixbench system benchmarks with BPF LSM and SELinux enabled with default policies enabled with and without these patches. Benchmark Delta(%): (+ is better) ========================================================================== Execl Throughput +1.9356 File Write 1024 bufsize 2000 maxblocks +6.5953 Pipe Throughput +9.5499 Pipe-based Context Switching +3.0209 Process Creation +2.3246 Shell Scripts (1 concurrent) +1.4975 System Call Overhead +2.7815 System Benchmarks Index Score (Partial Only): +3.4859 In the best case, some syscalls like eventfd_create benefitted to about ~10%. Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Song Liu <song@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-22lsm: count the LSMs enabled at compile timeKP Singh
These macros are a clever trick to determine a count of the number of LSMs that are enabled in the config to ascertain the maximum number of static calls that need to be configured per LSM hook. Without this one would need to generate static calls for the total number of LSMs in the kernel (even if they are not compiled) times the number of LSM hooks which ends up being quite wasteful. Tested-by: Guenter Roeck <linux@roeck-us.net> Suggested-by: Kui-Feng Lee <sinquersw@gmail.com> Suggested-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: John Johansen <john.johansen@canonical.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: Song Liu <song@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Nacked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: KP Singh <kpsingh@kernel.org> [PM: added IPE to the count during merge] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-08-22thermal: core: Unexport thermal_bind_cdev_to_trip() and ↵Rafael J. Wysocki
thermal_unbind_cdev_from_trip() Since thermal_bind_cdev_to_trip() and thermal_unbind_cdev_from_trip() are only called locally in the thermal core now, they can be static, so change their definitions accordingly and drop their headers from the global thermal header file. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Huisong Li <lihuisong@huawei.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/3512161.QJadu78ljV@rjwysocki.net
2024-08-22thermal: core: Introduce .should_bind() thermal zone callbackRafael J. Wysocki
The current design of the code binding cooling devices to trip points in thermal zones is convoluted and hard to follow. Namely, a driver that registers a thermal zone can provide .bind() and .unbind() operations for it, which are required to call either thermal_bind_cdev_to_trip() and thermal_unbind_cdev_from_trip(), respectively, or thermal_zone_bind_cooling_device() and thermal_zone_unbind_cooling_device(), respectively, for every relevant trip point and the given cooling device. Moreover, if .bind() is provided and .unbind() is not, the cleanup necessary during the removal of a thermal zone or a cooling device may not be carried out. In other words, the core relies on the thermal zone owners to do the right thing, which is error prone and far from obvious, even though all of that is not really necessary. Specifically, if the core could ask the thermal zone owner, through a special thermal zone callback, whether or not a given cooling device should be bound to a given trip point in the given thermal zone, it might as well carry out all of the binding and unbinding by itself. In particular, the unbinding can be done automatically without involving the thermal zone owner at all because all of the thermal instances associated with a thermal zone or cooling device going away must be deleted regardless. Accordingly, introduce a new thermal zone operation, .should_bind(), that can be invoked by the thermal core for a given thermal zone, trip point and cooling device combination in order to check whether or not the cooling device should be bound to the trip point at hand. It takes an additional cooling_spec argument allowing the thermal zone owner to specify the highest and lowest cooling states of the cooling device and its weight for the given trip point binding. Make the thermal core use this operation, if present, in the absence of .bind() and .unbind(). Note that .should_bind() will be called under the thermal zone lock. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Zhang Rui <rui.zhang@intel.com> Acked-by: Huisong Li <lihuisong@huawei.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/9334403.CDJkKcVGEf@rjwysocki.net
2024-08-22bpf: rename nocsr -> bpf_fastcall in verifierEduard Zingerman
Attribute used by LLVM implementation of the feature had been changed from no_caller_saved_registers to bpf_fastcall (see [1]). This commit replaces references to nocsr by references to bpf_fastcall to keep LLVM and Kernel parts in sync. [1] https://github.com/llvm/llvm-project/pull/105417 Acked-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240822084112.3257995-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-08-22ASoC: cs35l56: Make struct regmap_config constRichard Fitzgerald
It's now possible to declare instances of struct regmap_config as const data. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20240822145535.336407-1-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-08-22ASoC: remove snd_soc_dai_link_set_capabilities()Kuninori Morimoto
dpcm_xxx flags are no longer needed. We need to use xxx_only flags instead if needed, but snd_soc_dai_link_set_capabilities() user adds dpcm_xxx if playback/capture were available. Thus converting dpcm_xxx to xxx_only is not needed. Just remove it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Tested-by: Jerome Brunet <jbrunet@baylibre.com> Link: https://patch.msgid.link/87r0aiaahh.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-08-22ASoC: soc-pcm: Indicate warning if dpcm_playback/capture were used for ↵Kuninori Morimoto
availability limition I have been wondering why DPCM needs special flag (= dpcm_playback/capture) to use it. Below is the history why it was added to ASoC. (A) In beginning, there was no dpcm_xxx flag on ASoC. It checks channels_min for DPCM, same as current non-DPCM. Let's name it as "validation check" here. if (rtd->dai_link->dynamic || rtd->dai_link->no_pcm) { if (cpu_dai->driver->playback.channels_min) playback = 1; if (cpu_dai->driver->capture.channels_min) capture = 1; (B) commit 1e9de42f4324 ("ASoC: dpcm: Explicitly set BE DAI link supported stream directions") force to use dpcm_xxx flag on DPCM. According to this commit log, this is because "Some BE dummy DAI doesn't set channels_min for playback/capture". But we don't know which DAI is it, and not know why it can't/don't have channels_min. Let's name it as "no_chan_DAI" here. According to the code and git-log, it is used as DCPM-BE and is CPU DAI. I think the correct solution was set channels_min on "no_chan_DAI" side, not update ASoC framework side. But everything is under smoke today. if (rtd->dai_link->dynamic || rtd->dai_link->no_pcm) { playback = rtd->dai_link->dpcm_playback; capture = rtd->dai_link->dpcm_capture; (C) commit 9b5db059366a ("ASoC: soc-pcm: dpcm: Only allow playback/capture if supported") checks channels_min (= validation check) again. Because DPCM availability was handled by dpcm_xxx flag at that time, but some Sound Card set it even though it wasn't available. Clearly there's a contradiction here. I think correct solution was update Sound Card side instead of ASoC framework. Sound Card side will be updated to handle this issue later (commit 25612477d20b ("ASoC: soc-dai: set dai_link dpcm_ flags with a helper")) if (rtd->dai_link->dynamic || rtd->dai_link->no_pcm) { ... playback = rtd->dai_link->dpcm_playback && snd_soc_dai_stream_valid(cpu_dai, ...); capture = rtd->dai_link->dpcm_capture && snd_soc_dai_stream_valid(cpu_dai, ...); This (C) patch should have broken "no_chan_DAI" which doesn't have channels_min, but there was no such report during this 4 years. Possibilities case are as follows - No one is using "no_chan_DAI" - "no_chan_DAI" is no longer exist : was removed ? - "no_chan_DAI" is no longer exist : has channels_min ? Because of these history, this dpcm_xxx is unneeded flag today. But because we have been used it for 10 years since (B), it may have been used differently. For example some DAI available both playback/capture, but it set dpcm_playback flag only, in this case dpcm_xxx flag is used as availability limitation. We can use playback_only flag instead in this case, but it is very difficult to find such DAI today. Let's add grace time to remove dpcm_playback/capture flag. This patch don't use dpcm_xxx flag anymore, and indicates warning to use xxx_only flag if both playback/capture were available but using only one of dpcm_xxx flag, and not using xxx_only flag. Link: https://lore.kernel.org/r/87edaym2cg.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Tested-by: Jerome Brunet <jbrunet@baylibre.com> Link: https://patch.msgid.link/87seuyaahn.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-08-22ASoC: tas2781: mark const variables tas2563_dvc_table as __maybe_unusedShenghao Ding
In case of tas2781, tas2563_dvc_table will be unused, so mark it as __maybe_unused. Signed-off-by: Shenghao Ding <shenghao-ding@ti.com> Link: https://patch.msgid.link/20240822063205.662-1-shenghao-ding@ti.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-08-22USB: gadget: f_hid: Add GET_REPORT via userspace IOCTLChris Wulff
While supporting GET_REPORT is a mandatory request per the HID specification the current implementation of the GET_REPORT request responds to the USB Host with an empty reply of the request length. However, some USB Hosts will request the contents of feature reports via the GET_REPORT request. In addition, some proprietary HID 'protocols' will expect different data, for the same report ID, to be to become available in the feature report by sending a preceding SET_REPORT to the USB Device that defines what data is to be presented when that feature report is subsequently retrieved via GET_REPORT (with a very fast < 5ms turn around between the SET_REPORT and the GET_REPORT). There are two other patch sets already submitted for adding GET_REPORT support. The first [1] allows for pre-priming a list of reports via IOCTLs which then allows the USB Host to perform the request, with no further userspace interaction possible during the GET_REPORT request. And another [2] which allows for a single report to be setup by userspace via IOCTL, which will be fetched and returned by the kernel for subsequent GET_REPORT requests by the USB Host, also with no further userspace interaction possible. This patch, while loosely based on both the patch sets, differs by allowing the option for userspace to respond to each GET_REPORT request by setting up a poll to notify userspace that a new GET_REPORT request has arrived. To support this, two extra IOCTLs are supplied. The first of which is used to retrieve the report ID of the GET_REPORT request (in the case of having non-zero report IDs in the HID descriptor). The second IOCTL allows for storing report responses in a list for responding to requests. The report responses are stored in a list (it will be either added if it does not exist or updated if it exists already). A flag (userspace_req) can be set to whether subsequent requests notify userspace or not. Basic operation when a GET_REPORT request arrives from USB Host: - If the report ID exists in the list and it is set for immediate return (i.e. userspace_req == false) then response is sent immediately, userspace is not notified - The report ID does not exist, or exists but is set to notify userspace (i.e. userspace_req == true) then notify userspace via poll: - If userspace responds, and either adds or update the response in the list and respond to the host with the contents - If userspace does not respond within the fixed timeout (2500ms) but the report has been set prevously, then send 'old' report contents - If userspace does not respond within the fixed timeout (2500ms) and the report does not exist in the list then send an empty report Note that userspace could 'prime' the report list at any other time. While this patch allows for flexibility in how the system responds to requests, and therefore the HID 'protocols' that could be supported, a drawback is the time it takes to service the requests and therefore the maximum throughput that would be achievable. The USB HID Specification v1.11 itself states that GET_REPORT is not intended for periodic data polling, so this limitation is not severe. Testing on an iMX8M Nano Ultra Lite with a heavy multi-core CPU loading showed that userspace can typically respond to the GET_REPORT request within 1200ms - which is well within the 5000ms most operating systems seem to allow, and within the 2500ms set by this patch. [1] https://lore.kernel.org/all/20220805070507.123151-2-sunil@amarulasolutions.com/ [2] https://lore.kernel.org/all/20220726005824.2817646-1-vi@endrift.com/ Signed-off-by: David Sands <david.sands@biamp.com> Signed-off-by: Chris Wulff <chris.wulff@biamp.com> Link: https://lore.kernel.org/r/20240817142850.1311460-2-crwulff@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-22net: ipv6: ioam6: new feature tunsrcJustin Iurman
This patch provides a new feature (i.e., "tunsrc") for the tunnel (i.e., "encap") mode of ioam6. Just like seg6 already does, except it is attached to a route. The "tunsrc" is optional: when not provided (by default), the automatic resolution is applied. Using "tunsrc" when possible has a benefit: performance. See the comparison: - before (= "encap" mode): https://ibb.co/bNCzvf7 - after (= "encap" mode with "tunsrc"): https://ibb.co/PT8L6yq Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-22Merge tag 'drm-misc-next-2024-08-16' of ↵Daniel Vetter
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v6.12: Core Changes: ci: - Update dependencies docs: - Cleanups edid: - Improve debug logging - Clean up interface fbdev emulation: - Remove old fbdev hooks - Update documentation panic: - Cleanups Driver Changes: amdgpu: - Remove usage of old fbdev hooks - Use backlight constants ast: - Fix timeout loop for DP link training hisilicon: - hibmc: Cleanups mipi-dsi: - Improve error handling - startek-kd070fhfid015: Use new error handling nouveau: - Remove usage of old fbdev hooks panel: - Use backlight constants radeon: - Use backlight constants rockchip: - Improve DP sink-capability reporting - Cleanups - dw_hdmi: Support 4k@60Hz; Cleanups - vop: Support RGB display on Rockchip RK3066; Support 4096px width tilcdc: - Use backlight constants Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240816084109.GA229316@localhost.localdomain
2024-08-22dma-mapping: direct calls for dma-iommuLeon Romanovsky
Directly call into dma-iommu just like we have been doing for dma-direct for a while. This avoids the indirect call overhead for IOMMU ops and removes the need to have DMA ops entirely for many common configurations. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2024-08-22dma-mapping: replace zone_dma_bits by zone_dma_limitCatalin Marinas
The hardware DMA limit might not be power of 2. When RAM range starts above 0, say 4GB, DMA limit of 30 bits should end at 5GB. A single high bit can not encode this limit. Use a plain address for the DMA zone limit instead. Since the DMA zone can now potentially span beyond 4GB physical limit of DMA32, make sure to use DMA zone for GFP_DMA32 allocations in that case. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Co-developed-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Baruch Siach <baruch@tkos.co.il> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Petr Tesarik <ptesarik@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2024-08-21net: repack struct netdev_queueJakub Kicinski
Adding the NAPI pointer to struct netdev_queue made it grow into another cacheline, even though there was 44 bytes of padding available. The struct was historically grouped as follows: /* read-mostly stuff (align) */ /* ... random control path fields ... */ /* write-mostly stuff (align) */ /* ... 40 byte hole ... */ /* struct dql (align) */ It seems that people want to add control path fields after the read only fields. struct dql looks pretty innocent but it forces its own alignment and nothing indicates that there is a lot of empty space above it. Move dql above the xmit_lock. This shifts the empty space to the end of the struct rather than in the middle of it. Move two example fields there to set an example. Hopefully people will now add new fields at the end of the struct. A lot of the read-only stuff is also control path-only, but if we move it all we'll have another hole in the middle. Before: /* size: 384, cachelines: 6, members: 16 */ /* sum members: 284, holes: 3, sum holes: 100 */ After: /* size: 320, cachelines: 5, members: 16 */ /* sum members: 284, holes: 1, sum holes: 8 */ Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20240820205119.1321322-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-21drm/fourcc: define Intel Xe2 related tile4 ccs modifiersJuha-Pekka Heikkila
Add Tile4 type ccs modifiers to indicate presence of compression on Xe2. Here is defined I915_FORMAT_MOD_4_TILED_LNL_CCS which is meant for integrated graphics with igpu related limitations Here is also defined I915_FORMAT_MOD_4_TILED_BMG_CCS which is meant for discrete graphics with dgpu related limitations Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240816115229.531671-3-juhapekka.heikkila@gmail.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-21bpf: extract iterator argument type and name validation logicAndrii Nakryiko
Verifier enforces that all iterator structs are named `bpf_iter_<name>` and that whenever iterator is passed to a kfunc it's passed as a valid PTR -> STRUCT chain (with potentially const modifiers in between). We'll need this check for upcoming changes, so instead of duplicating the logic, extract it into a helper function. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20240808232230.2848712-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-08-21workqueue: Fix another htmldocs build warningTejun Heo
Fix htmldocs build warning introduced by 9b59a85a84dc ("workqueue: Don't call va_start / va_end twice"). Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Matthew Brost <matthew.brost@intel.com>
2024-08-21soc: qcom: pmic_glink: Fix race during initializationBjorn Andersson
As pointed out by Stephen Boyd it is possible that during initialization of the pmic_glink child drivers, the protection-domain notifiers fires, and the associated work is scheduled, before the client registration returns and as a result the local "client" pointer has been initialized. The outcome of this is a NULL pointer dereference as the "client" pointer is blindly dereferenced. Timeline provided by Stephen: CPU0 CPU1 ---- ---- ucsi->client = NULL; devm_pmic_glink_register_client() client->pdr_notify(client->priv, pg->client_state) pmic_glink_ucsi_pdr_notify() schedule_work(&ucsi->register_work) <schedule away> pmic_glink_ucsi_register() ucsi_register() pmic_glink_ucsi_read_version() pmic_glink_ucsi_read() pmic_glink_ucsi_read() pmic_glink_send(ucsi->client) <client is NULL BAD> ucsi->client = client // Too late! This code is identical across the altmode, battery manager and usci child drivers. Resolve this by splitting the allocation of the "client" object and the registration thereof into two operations. This only happens if the protection domain registry is populated at the time of registration, which by the introduction of commit '1ebcde047c54 ("soc: qcom: add pd-mapper implementation")' became much more likely. Reported-by: Amit Pundir <amit.pundir@linaro.org> Closes: https://lore.kernel.org/all/CAMi1Hd2_a7TjA7J9ShrAbNOd_CoZ3D87twmO5t+nZxC9sX18tA@mail.gmail.com/ Reported-by: Johan Hovold <johan@kernel.org> Closes: https://lore.kernel.org/all/ZqiyLvP0gkBnuekL@hovoldconsulting.com/ Reported-by: Stephen Boyd <swboyd@chromium.org> Closes: https://lore.kernel.org/all/CAE-0n52JgfCBWiFQyQWPji8cq_rCsviBpW-m72YitgNfdaEhQg@mail.gmail.com/ Fixes: 58ef4ece1e41 ("soc: qcom: pmic_glink: Introduce base PMIC GLINK driver") Cc: stable@vger.kernel.org Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Tested-by: Amit Pundir <amit.pundir@linaro.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Acked-by: Sebastian Reichel <sebastian.reichel@collabora.com> Tested-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com> Link: https://lore.kernel.org/r/20240820-pmic-glink-v6-11-races-v3-1-eec53c750a04@quicinc.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2024-08-21printk: nbcon: Implement emergency sectionsThomas Gleixner
In emergency situations (something has gone wrong but the system continues to operate), usually important information (such as a backtrace) is generated via printk(). This information should be pushed out to the consoles ASAP. Add per-CPU emergency nesting tracking because an emergency can arise while in an emergency situation. Add functions to mark the beginning and end of emergency sections where the urgent messages are generated. Perform direct console flushing at the emergency priority if the current CPU is in an emergency state and it is safe to do so. Note that the emergency state is not system-wide. While one CPU is in an emergency state, another CPU may attempt to print console messages at normal priority. Also note that printk() already attempts to flush consoles in the caller context for normal priority. However, follow-up changes will introduce printing kthreads, in which case the normal priority printk() calls will offload to the kthreads. Co-developed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-32-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: Coordinate direct printing in panicJohn Ogness
If legacy and nbcon consoles are registered and the nbcon consoles are allowed to flush (i.e. no boot consoles registered), the legacy consoles will no longer perform direct printing on the panic CPU until after the backtrace has been stored. This will give the safe nbcon consoles a chance to print the panic messages before allowing the unsafe legacy consoles to print. If no nbcon consoles are registered or they are not allowed to flush because boot consoles are registered, there is no change in behavior (i.e. legacy consoles will always attempt to print from the printk() caller context). Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-30-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Add unsafe flushing on panicJohn Ogness
Add nbcon_atomic_flush_unsafe() to flush all nbcon consoles using the write_atomic() callback and allowing unsafe hostile takeovers. Call this at the end of panic() as a final attempt to flush any pending messages. Note that legacy consoles use unsafe methods for flushing from the beginning of panic (see bust_spinlocks()). Therefore, systems using both legacy and nbcon consoles may still fail to see panic messages due to unsafe legacy console usage. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-27-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21serial: core: Acquire nbcon context in port->lock wrapperJohn Ogness
Currently the port->lock wrappers uart_port_lock(), uart_port_unlock() (and their variants) only lock/unlock the spin_lock. If the port is an nbcon console that has implemented the write_atomic() callback, the wrappers must also acquire/release the console context and mark the region as unsafe. This allows general port->lock synchronization to be synchronized against the nbcon write_atomic() callback. Note that __uart_port_using_nbcon() relies on the port->lock being held while a console is added and removed from the console list (i.e. all uart nbcon drivers *must* take the port->lock in their device_lock() callbacks). Signed-off-by: John Ogness <john.ogness@linutronix.de> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-15-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21nbcon: Add API to acquire context for non-printing operationsJohn Ogness
Provide functions nbcon_device_try_acquire() and nbcon_device_release() which will try to acquire the nbcon console ownership with NBCON_PRIO_NORMAL and mark it unsafe for handover/takeover. These functions are to be used together with the device-specific locking when performing non-printing activities on the console device. They will allow synchronization against the atomic_write() callback which will be serialized, for higher priority contexts, only by acquiring the console context ownership. Pitfalls: The API requires to be called in a context with migration disabled because it uses per-CPU variables internally. The context is set unsafe for a takeover all the time. It guarantees full serialization against any atomic_write() caller except for the final flush in panic() which might try an unsafe takeover. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-14-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21console: Improve console_srcu_read_flags() commentsJohn Ogness
It was not clear when exactly console_srcu_read_flags() must be used vs. directly reading @console->flags. Refactor and clarify that console_srcu_read_flags() is only needed if the console is registered or the caller is in a context where the registration status of the console may change (due to another context). The function requires the caller holds @console_srcu, which will ensure that the caller sees an appropriate @flags value for the registered console and that exit/cleanup routines will not run if the console is in the process of unregistration. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-13-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21serial: core: Introduce wrapper to set @uart_port->consJohn Ogness
Introduce uart_port_set_cons() as a wrapper to set @cons of a uart_port. The wrapper sets @cons under the port lock in order to prevent @cons from disappearing while another context is holding the port lock. This is necessary for a follow-up commit relating to the port lock wrappers, which rely on @cons not changing between lock and unlock. Signed-off-by: John Ogness <john.ogness@linutronix.de> Tested-by: Théo Lebrun <theo.lebrun@bootlin.com> # EyeQ5, AMBA-PL011 Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240820063001.36405-12-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21serial: core: Provide low-level functions to lock portJohn Ogness
It will be necessary at times for the uart nbcon console drivers to acquire the port lock directly (without the additional nbcon functionality of the port lock wrappers). These are special cases such as the implementation of the device_lock()/device_unlock() callbacks or for internal port lock wrapper synchronization. Provide low-level variants __uart_port_lock_irqsave() and __uart_port_unlock_irqrestore() for this purpose. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20240820063001.36405-11-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Add callbacks to synchronize with driverJohn Ogness
Console drivers typically must deal with access to the hardware via user input/output (such as an interactive login shell) and output of kernel messages via printk() calls. To provide the necessary synchronization, usually some driver-specific locking mechanism is used (for example, the port spinlock for uart serial consoles). Until now, usage of this driver-specific locking has been hidden from the printk-subsystem and implemented within the various console callbacks. However, nbcon consoles would need to use it even in the generic code. Add device_lock() and device_unlock() callback which will need to get implemented by nbcon consoles. The callbacks will use whatever synchronization mechanism the driver is using for itself. The minimum requirement is to prevent CPU migration. It would allow a context friendly acquiring of nbcon console ownership in non-emergency and non-panic context. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-9-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Add detailed doc for write_atomic()John Ogness
The write_atomic() callback has special requirements and is allowed to use special helper functions. Provide detailed documentation of the callback so that a developer has a chance of implementing it correctly. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-8-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Remove return value for write_atomic()John Ogness
The return value of write_atomic() does not provide any useful information. On the contrary, it makes things more complicated for the caller to appropriately deal with the information. Change write_atomic() to not have a return value. If the message did not get printed due to loss of ownership, the caller will notice this on its own. If ownership was not lost, it will be assumed that the driver successfully printed the message and the sequence number for that console will be incremented. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-7-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>