summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2022-11-08Merge git://linuxtv.org/sailus/media_tree into media_stageMauro Carvalho Chehab
* git://linuxtv.org/sailus/media_tree: (47 commits) media: i2c: ov4689: code cleanup media: ov9650: Drop platform data code path media: ov7670: Drop unused include media: ov2640: Drop legacy includes media: tc358746: add Toshiba TC358746 Parallel to CSI-2 bridge driver media: dt-bindings: add bindings for Toshiba TC358746 phy: dphy: add support to calculate the timing based on hs_clk_rate phy: dphy: refactor get_default_config v4l: subdev: Warn if disabling streaming failed, return success dw9768: Enable low-power probe on ACPI media: i2c: imx290: Replace GAIN control with ANALOGUE_GAIN media: i2c: imx290: Add crop selection targets support media: i2c: imx290: Factor out format retrieval to separate function media: i2c: imx290: Move registers with fixed value to init array media: i2c: imx290: Create controls for fwnode properties media: i2c: imx290: Implement HBLANK and VBLANK controls media: i2c: imx290: Split control initialization to separate function media: i2c: imx290: Fix max gain value media: i2c: imx290: Add exposure time control media: i2c: imx290: Define more register macros ...
2022-11-07mm/percpu: remove unused PERCPU_DYNAMIC_EARLY_SLOTSBaoquan He
Since commit 40064aeca35c ("percpu: replace area map allocator with bitmap"), there's no place to use PERCPU_DYNAMIC_EARLY_SLOTS. So clean it up. Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Dennis Zhou <dennis@kernel.org>
2022-11-07net: remove explicit phylink_generic_validate() referencesRussell King (Oracle)
Virtually all conventional network drivers are now converted to use phylink_generic_validate() - only DSA drivers and fman_memac remain, so lets remove the necessity for network drivers to explicitly set this member, and default to phylink_generic_validate() when unset. This is possible as .validate must currently be set. Any remaining instances that have not been addressed by this patch can be fixed up later. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/E1or0FZ-001tRa-DI@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-07mtd: nand: drop EXPORT_SYMBOL_GPL for nanddev_erase()Dario Binacchi
This function is only used within this module, so it is no longer necessary to use EXPORT_SYMBOL_GPL(). Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20221018170205.1733958-1-dario.binacchi@amarulasolutions.com
2022-11-07arm_pmu: rework ACPI probingMark Rutland
The current ACPI PMU probing logic tries to associate PMUs with CPUs when the CPU is first brought online, in order to handle late hotplug, though PMUs are only registered during early boot, and so for late hotplugged CPUs this can only associate the CPU with an existing PMU. We tried to be clever and the have the arm_pmu_acpi_cpu_starting() callback allocate a struct arm_pmu when no matching instance is found, in order to avoid duplication of logic. However, as above this doesn't do anything useful for late hotplugged CPUs, and this requires us to allocate memory in an atomic context, which is especially problematic for PREEMPT_RT, as reported by Valentin and Pierre. This patch reworks the probing to detect PMUs for all online CPUs in the arm_pmu_acpi_probe() function, which is more aligned with how DT probing works. The arm_pmu_acpi_cpu_starting() callback only tries to associate CPUs with an existing arm_pmu instance, avoiding the problem of allocating in atomic context. Note that as we didn't previously register PMUs for late-hotplugged CPUs, this change doesn't result in a loss of existing functionality, though we will now warn when we cannot associate a CPU with a PMU. This change allows us to pull the hotplug callback registration into the arm_pmu_acpi_probe() function, as we no longer need the callbacks to be invoked shortly after probing the boot CPUs, and can register it without invoking the calls. For the moment the arm_pmu_acpi_init() initcall remains to register the SPE PMU, though in future this should probably be moved elsewhere (e.g. the arm64 ACPI init code), since this doesn't need to be tied to the regular CPU PMU code. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lore.kernel.org/r/20210810134127.1394269-2-valentin.schneider@arm.com/ Reported-by: Pierre Gondois <pierre.gondois@arm.com> Link: https://lore.kernel.org/linux-arm-kernel/20220912155105.1443303-1-pierre.gondois@arm.com/ Cc: Pierre Gondois <pierre.gondois@arm.com> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Will Deacon <will@kernel.org> Reviewed-and-tested-by: Pierre Gondois <pierre.gondois@arm.com> Link: https://lore.kernel.org/r/20220930111844.1522365-4-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-07ACPI: ARM Performance Monitoring Unit Table (APMT) initial supportBesar Wicaksono
ARM Performance Monitoring Unit Table describes the properties of PMU support in ARM-based system. The APMT table contains a list of nodes, each represents a PMU in the system that conforms to ARM CoreSight PMU architecture. The properties of each node include information required to access the PMU (e.g. MMIO base address, interrupt number) and also identification. For more detailed information, please refer to the specification below: * APMT: https://developer.arm.com/documentation/den0117/latest * ARM Coresight PMU: https://developer.arm.com/documentation/ihi0091/latest The initial support adds the detection of APMT table and generic infrastructure to create platform devices for ARM CoreSight PMUs. Similar to IORT the root pointer of APMT is preserved during runtime and each PMU platform device is given a pointer to the corresponding APMT node. Signed-off-by: Besar Wicaksono <bwicaksono@nvidia.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Link: https://lore.kernel.org/r/20220929002834.32664-1-bwicaksono@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>
2022-11-07can: dev: fix skb drop checkOliver Hartkopp
In commit a6d190f8c767 ("can: skb: drop tx skb if in listen only mode") the priv->ctrlmode element is read even on virtual CAN interfaces that do not create the struct can_priv at startup. This out-of-bounds read may lead to CAN frame drops for virtual CAN interfaces like vcan and vxcan. This patch mainly reverts the original commit and adds a new helper for CAN interface drivers that provide the required information in struct can_priv. Fixes: a6d190f8c767 ("can: skb: drop tx skb if in listen only mode") Reported-by: Dariusz Stojaczyk <Dariusz.Stojaczyk@opensynergy.com> Cc: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Cc: Max Staudt <max@enpas.org> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://lore.kernel.org/all/20221102095431.36831-1-socketcan@hartkopp.net Cc: stable@vger.kernel.org # 6.0.x [mkl: patch pch_can, too] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-11-07net: mv643xx_eth: support MII/GMII/RGMII modes for KirkwoodDavid Yang
Support mode switch properly, which is not available before. If SoC has two Ethernet controllers, by setting both of them into MII mode, the first controller enters GMII mode, while the second controller is effectively disabled. This requires configuring (and maybe enabling) the second controller in the device tree, even though it cannot be used. Signed-off-by: David Yang <mmyangfl@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-04lsm: make security_socket_getpeersec_stream() sockptr_t safePaul Moore
Commit 4ff09db1b79b ("bpf: net: Change sk_getsockopt() to take the sockptr_t argument") made it possible to call sk_getsockopt() with both user and kernel address space buffers through the use of the sockptr_t type. Unfortunately at the time of conversion the security_socket_getpeersec_stream() LSM hook was written to only accept userspace buffers, and in a desire to avoid having to change the LSM hook the commit author simply passed the sockptr_t's userspace buffer pointer. Since the only sk_getsockopt() callers at the time of conversion which used kernel sockptr_t buffers did not allow SO_PEERSEC, and hence the security_socket_getpeersec_stream() hook, this was acceptable but also very fragile as future changes presented the possibility of silently passing kernel space pointers to the LSM hook. There are several ways to protect against this, including careful code review of future commits, but since relying on code review to catch bugs is a recipe for disaster and the upstream eBPF maintainer is "strongly against defensive programming", this patch updates the LSM hook, and all of the implementations to support sockptr_t and safely handle both user and kernel space buffers. Acked-by: Casey Schaufler <casey@schaufler-ca.com> Acked-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2022-11-04bpf: Convert BPF_DISPATCHER to use static_call() (not ftrace)Peter Zijlstra
The dispatcher function is currently abusing the ftrace __fentry__ call location for its own purposes -- this obviously gives trouble when the dispatcher and ftrace are both in use. A previous solution tried using __attribute__((patchable_function_entry())) which works, except it is GCC-8+ only, breaking the build on the earlier still supported compilers. Instead use static_call() -- which has its own annotations and does not conflict with ftrace -- to rewrite the dispatch function. By using: return static_call()(ctx, insni, bpf_func) you get a perfect forwarding tail call as function body (iow a single jmp instruction). By having the default static_call() target be bpf_dispatcher_nop_func() it retains the default behaviour (an indirect call to the argument function). Only once a dispatcher program is attached is the target rewritten to directly call the JIT'ed image. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Björn Töpel <bjorn@kernel.org> Tested-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lkml.kernel.org/r/Y1/oBlK0yFk5c/Im@hirez.programming.kicks-ass.net Link: https://lore.kernel.org/bpf/20221103120647.796772565@infradead.org
2022-11-04bpf: Revert ("Fix dispatcher patchable function entry to 5 bytes nop")Peter Zijlstra
Because __attribute__((patchable_function_entry)) is only available since GCC-8 this solution fails to build on the minimum required GCC version. Undo these changes so we might try again -- without cluttering up the patches with too many changes. This is an almost complete revert of: dbe69b299884 ("bpf: Fix dispatcher patchable function entry to 5 bytes nop") ceea991a019c ("bpf: Move bpf_dispatcher function out of ftrace locations") (notably the arch/x86/Kconfig hunk is kept). Reported-by: David Laight <David.Laight@aculab.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Björn Töpel <bjorn@kernel.org> Tested-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lkml.kernel.org/r/439d8dc735bb4858875377df67f1b29a@AcuMS.aculab.com Link: https://lore.kernel.org/bpf/20221103120647.728830733@infradead.org
2022-11-04Merge tag 'hardening-v6.1-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fix from Kees Cook: - Correctly report struct member size on memcpy overflow (Kees Cook) * tag 'hardening-v6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: fortify: Capture __bos() results in const temp vars
2022-11-04Merge tag 'efi-fixes-for-v6.1-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fixes from Ard Biesheuvel: - A pair of tweaks to the EFI random seed code so that externally provided version of this config table are handled more robustly - Another fix for the v6.0 EFI variable refactor that turned out to break Apple machines which don't provide QueryVariableInfo() - Add some guard rails to the EFI runtime service call wrapper so we can recover from synchronous exceptions caused by firmware * tag 'efi-fixes-for-v6.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: arm64: efi: Recover from synchronous exceptions occurring in firmware efi: efivars: Fix variable writes with unsupported query_variable_store() efi: random: Use 'ACPI reclaim' memory for random seed efi: random: reduce seed size to 32 bytes efi/tpm: Pass correct address to memblock_reserve
2022-11-04mm/slab: remove !CONFIG_TRACING variants of kmalloc_[node_]trace()Vlastimil Babka
For !CONFIG_TRACING kernels, the kmalloc() implementation tries (in cases where the allocation size is build-time constant) to save a function call, by inlining kmalloc_trace() to a kmem_cache_alloc() call. However since commit 6edf2576a6cc ("mm/slub: enable debugging memory wasting of kmalloc") this path now fails to pass the original request size to be eventually recorded (for kmalloc caches with debugging enabled). We could adjust the code to call __kmem_cache_alloc_node() as the CONFIG_TRACING variant, but that would as a result inline a call with 5 parameters, bloating the kmalloc() call sites. The cost of extra function call (to kmalloc_trace()) seems like a lesser evil. It also appears that the !CONFIG_TRACING variant is incompatible with upcoming hardening efforts [1] so it's easier if we just remove it now. Kernels with no tracing are rare these days and the benefit is dubious anyway. [1] https://lore.kernel.org/linux-mm/20221101222520.never.109-kees@kernel.org/T/#m20ecf14390e406247bde0ea9cce368f469c539ed Link: https://lore.kernel.org/all/097d8fba-bd10-a312-24a3-a4068c4f424c@suse.cz/ Suggested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2022-11-04spi: Merge spi_controller.{slave,target}_abort()Geert Uytterhoeven
Mixing SPI slave/target handlers and SPI slave/target controllers using legacy and modern naming does not work well: there are now two different callbacks for aborting a slave/target operation, of which only one is populated, while spi_{slave,target}_abort() check and use only one, which may be the unpopulated one. Fix this by merging the slave/target abort callbacks into a single callback using a union, like is already done for the slave/target flags. Fixes: b8d3b056a78dcc94 ("spi: introduce new helpers with using modern naming") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/809c82d54b85dd87ef7ee69fc93016085be85cec.1667555967.git.geert+renesas@glider.be Signed-off-by: Mark Brown <broonie@kernel.org>
2022-11-04bcma: Use the proper gpio includeLinus Walleij
The <linux/bcma/bcma_driver_chipcommon.h> is including the legacy header <linux/gpio.h> to obtain struct gpio_chip. Instead, include <linux/gpio/driver.h> where this struct is defined. It turns out that the brcm80211 brcmsmac depends on this to bring in the symbol gpio_is_valid(). The driver looks up the BCMA parent GPIO driver and checks that this succeeds, but then it goes on to use the deprecated GPIO call gpio_is_valid() to check the consistency of the .base member of the BCMA GPIO struct. The whole check can be dropped because the bcma_gpio is initialized in the declarations: struct gpio_chip *bcma_gpio = &cc_drv->gpio; And this can never be NULL. Cc: Jonas Gorski <jonas.gorski@gmail.com> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20221028092332.238728-1-linus.walleij@linaro.org
2022-11-04Merge tag 'drm-intel-gt-next-2022-11-03' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next Driver Changes: - Fix for #7306: [Arc A380] white flickering when using arc as a secondary gpu (Matt A) - Add Wa_18017747507 for DG2 (Wayne) - Avoid spurious WARN on DG1 due to incorrect cache_dirty flag (Niranjana, Matt A) - Corrections to CS timestamp support for Gen5 and earlier (Ville) - Fix a build error used with clang compiler on hwmon (GG) - Improvements to LMEM handling with RPM (Anshuman, Matt A) - Cleanups in dmabuf code (Mike) - Selftest improvements (Matt A) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Y2N11wu175p6qeEN@jlahtine-mobl.ger.corp.intel.com
2022-11-03bpf: Refactor map->off_arr handlingKumar Kartikeya Dwivedi
Refactor map->off_arr handling into generic functions that can work on their own without hardcoding map specific code. The btf_fields_offs structure is now returned from btf_parse_field_offs, which can be reused later for types in program BTF. All functions like copy_map_value, zero_map_value call generic underlying functions so that they can also be reused later for copying to values allocated in programs which encode specific fields. Later, some helper functions will also require access to this btf_field_offs structure to be able to skip over special fields at runtime. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20221103191013.1236066-9-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-03bpf: Consolidate spin_lock, timer management into btf_recordKumar Kartikeya Dwivedi
Now that kptr_off_tab has been refactored into btf_record, and can hold more than one specific field type, accomodate bpf_spin_lock and bpf_timer as well. While they don't require any more metadata than offset, having all special fields in one place allows us to share the same code for allocated user defined types and handle both map values and these allocated objects in a similar fashion. As an optimization, we still keep spin_lock_off and timer_off offsets in the btf_record structure, just to avoid having to find the btf_field struct each time their offset is needed. This is mostly needed to manipulate such objects in a map value at runtime. It's ok to hardcode just one offset as more than one field is disallowed. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20221103191013.1236066-8-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-03bpf: Refactor kptr_off_tab into btf_recordKumar Kartikeya Dwivedi
To prepare the BPF verifier to handle special fields in both map values and program allocated types coming from program BTF, we need to refactor the kptr_off_tab handling code into something more generic and reusable across both cases to avoid code duplication. Later patches also require passing this data to helpers at runtime, so that they can work on user defined types, initialize them, destruct them, etc. The main observation is that both map values and such allocated types point to a type in program BTF, hence they can be handled similarly. We can prepare a field metadata table for both cases and store them in struct bpf_map or struct btf depending on the use case. Hence, refactor the code into generic btf_record and btf_field member structs. The btf_record represents the fields of a specific btf_type in user BTF. The cnt indicates the number of special fields we successfully recognized, and field_mask is a bitmask of fields that were found, to enable quick determination of availability of a certain field. Subsequently, refactor the rest of the code to work with these generic types, remove assumptions about kptr and kptr_off_tab, rename variables to more meaningful names, etc. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20221103191013.1236066-7-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-03net: remove unused ndo_get_devlink_portJiri Pirko
Remove ndo_get_devlink_port which is no longer used alongside with the implementations in drivers. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03net: devlink: track netdev with devlink_port assignedJiri Pirko
Currently, ethernet drivers are using devlink_port_type_eth_set() and devlink_port_type_clear() to set devlink port type and link to related netdev. Instead of calling them directly, let the driver use SET_NETDEV_DEVLINK_PORT macro to assign devlink_port pointer and let devlink to track it. Note the devlink port pointer is static during the time netdevice is registered. In devlink code, use per-namespace netdev notifier to track the netdevices with devlink_port assigned and change the internal devlink_port type and related type pointer accordingly. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03bridge: Add MAC Authentication Bypass (MAB) supportHans J. Schultz
Hosts that support 802.1X authentication are able to authenticate themselves by exchanging EAPOL frames with an authenticator (Ethernet bridge, in this case) and an authentication server. Access to the network is only granted by the authenticator to successfully authenticated hosts. The above is implemented in the bridge using the "locked" bridge port option. When enabled, link-local frames (e.g., EAPOL) can be locally received by the bridge, but all other frames are dropped unless the host is authenticated. That is, unless the user space control plane installed an FDB entry according to which the source address of the frame is located behind the locked ingress port. The entry can be dynamic, in which case learning needs to be enabled so that the entry will be refreshed by incoming traffic. There are deployments in which not all the devices connected to the authenticator (the bridge) support 802.1X. Such devices can include printers and cameras. One option to support such deployments is to unlock the bridge ports connecting these devices, but a slightly more secure option is to use MAB. When MAB is enabled, the MAC address of the connected device is used as the user name and password for the authentication. For MAB to work, the user space control plane needs to be notified about MAC addresses that are trying to gain access so that they will be compared against an allow list. This can be implemented via the regular learning process with the sole difference that learned FDB entries are installed with a new "locked" flag indicating that the entry cannot be used to authenticate the device. The flag cannot be set by user space, but user space can clear the flag by replacing the entry, thereby authenticating the device. Locked FDB entries implement the following semantics with regards to roaming, aging and forwarding: 1. Roaming: Locked FDB entries can roam to unlocked (authorized) ports, in which case the "locked" flag is cleared. FDB entries cannot roam to locked ports regardless of MAB being enabled or not. Therefore, locked FDB entries are only created if an FDB entry with the given {MAC, VID} does not already exist. This behavior prevents unauthenticated devices from disrupting traffic destined to already authenticated devices. 2. Aging: Locked FDB entries age and refresh by incoming traffic like regular entries. 3. Forwarding: Locked FDB entries forward traffic like regular entries. If user space detects an unauthorized MAC behind a locked port and wishes to prevent traffic with this MAC DA from reaching the host, it can do so using tc or a different mechanism. Enable the above behavior using a new bridge port option called "mab". It can only be enabled on a bridge port that is both locked and has learning enabled. Locked FDB entries are flushed from the port once MAB is disabled. A new option is added because there are pure 802.1X deployments that are not interested in notifications about locked FDB entries. Signed-off-by: Hans J. Schultz <netdev@kapio-technology.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== bpf 2022-11-04 We've added 8 non-merge commits during the last 3 day(s) which contain a total of 10 files changed, 113 insertions(+), 16 deletions(-). The main changes are: 1) Fix memory leak upon allocation failure in BPF verifier's stack state tracking, from Kees Cook. 2) Fix address leakage when BPF progs release reference to an object, from Youlin Li. 3) Fix BPF CI breakage from buggy in.h uapi header dependency, from Andrii Nakryiko. 4) Fix bpftool pin sub-command's argument parsing, from Pu Lehui. 5) Fix BPF sockmap lockdep warning by cancelling psock work outside of socket lock, from Cong Wang. 6) Follow-up for BPF sockmap to fix sk_forward_alloc accounting, from Wang Yufen. bpf-for-netdev * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Add verifier test for release_reference() bpf: Fix wrong reg type conversion in release_reference() bpf, sock_map: Move cancel_work_sync() out of sock lock tools/headers: Pull in stddef.h to uapi to fix BPF selftests build in CI net/ipv4: Fix linux/in.h header dependencies bpftool: Fix NULL pointer dereference when pin {PROG, MAP, LINK} without FILE bpf, sockmap: Fix the sk->sk_forward_alloc warning of sk_stream_kill_queues bpf, verifier: Fix memory leak in array reallocation for stack state ==================== Link: https://lore.kernel.org/r/20221104000445.30761-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03bpf: Allow specifying volatile type modifier for kptrsKumar Kartikeya Dwivedi
This is useful in particular to mark the pointer as volatile, so that compiler treats each load and store to the field as a volatile access. The alternative is having to define and use READ_ONCE and WRITE_ONCE in the BPF program. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20221103191013.1236066-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-11-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03regulator: userspace-consumer: Handle regulator-output DT nodesZev Weiss
In addition to adding some fairly simple OF support code, we make some slight adjustments to the userspace-consumer driver to properly support use with regulator-output hardware: - We now do an exclusive get of the supply regulators so as to prevent regulator_init_complete_work from automatically disabling them. - Instead of assuming that the supply is initially disabled, we now query its state to determine the initial value of drvdata->enabled. Signed-off-by: Zev Weiss <zev@bewilderbeest.net> Link: https://lore.kernel.org/r/20221031233704.22575-4-zev@bewilderbeest.net Signed-off-by: Mark Brown <broonie@kernel.org>
2022-11-03regulator: devres: Add devm_regulator_bulk_get_exclusive()Zev Weiss
We had an exclusive variant of the devm_regulator_get() API, but no corresponding variant for the bulk API; let's add one now. We add a generalized version of the existing regulator_bulk_get() function that additionally takes a get_type parameter and redefine regulator_bulk_get() in terms of it, then do similarly with devm_regulator_bulk_get(), and finally add the new devm_regulator_bulk_get_exclusive(). Signed-off-by: Zev Weiss <zev@bewilderbeest.net> Link: https://lore.kernel.org/r/20221031233704.22575-2-zev@bewilderbeest.net Signed-off-by: Mark Brown <broonie@kernel.org>
2022-11-03bpf, sock_map: Move cancel_work_sync() out of sock lockCong Wang
Stanislav reported a lockdep warning, which is caused by the cancel_work_sync() called inside sock_map_close(), as analyzed below by Jakub: psock->work.func = sk_psock_backlog() ACQUIRE psock->work_mutex sk_psock_handle_skb() skb_send_sock() __skb_send_sock() sendpage_unlocked() kernel_sendpage() sock->ops->sendpage = inet_sendpage() sk->sk_prot->sendpage = tcp_sendpage() ACQUIRE sk->sk_lock tcp_sendpage_locked() RELEASE sk->sk_lock RELEASE psock->work_mutex sock_map_close() ACQUIRE sk->sk_lock sk_psock_stop() sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED) cancel_work_sync() __cancel_work_timer() __flush_work() // wait for psock->work to finish RELEASE sk->sk_lock We can move the cancel_work_sync() out of the sock lock protection, but still before saved_close() was called. Fixes: 799aa7f98d53 ("skmsg: Avoid lock_sock() in sk_psock_backlog()") Reported-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20221102043417.279409-1-xiyou.wangcong@gmail.com
2022-11-02Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== bpf-next 2022-11-02 We've added 70 non-merge commits during the last 14 day(s) which contain a total of 96 files changed, 3203 insertions(+), 640 deletions(-). The main changes are: 1) Make cgroup local storage available to non-cgroup attached BPF programs such as tc BPF ones, from Yonghong Song. 2) Avoid unnecessary deadlock detection and failures wrt BPF task storage helpers, from Martin KaFai Lau. 3) Add LLVM disassembler as default library for dumping JITed code in bpftool, from Quentin Monnet. 4) Various kprobe_multi_link fixes related to kernel modules, from Jiri Olsa. 5) Optimize x86-64 JIT with emitting BMI2-based shift instructions, from Jie Meng. 6) Improve BPF verifier's memory type compatibility for map key/value arguments, from Dave Marchevsky. 7) Only create mmap-able data section maps in libbpf when data is exposed via skeletons, from Andrii Nakryiko. 8) Add an autoattach option for bpftool to load all object assets, from Wang Yufen. 9) Various memory handling fixes for libbpf and BPF selftests, from Xu Kuohai. 10) Initial support for BPF selftest's vmtest.sh on arm64, from Manu Bretelle. 11) Improve libbpf's BTF handling to dedup identical structs, from Alan Maguire. 12) Add BPF CI and denylist documentation for BPF selftests, from Daniel Müller. 13) Check BPF cpumap max_entries before doing allocation work, from Florian Lehner. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (70 commits) samples/bpf: Fix typo in README bpf: Remove the obsolte u64_stats_fetch_*_irq() users. bpf: check max_entries before allocating memory bpf: Fix a typo in comment for DFS algorithm bpftool: Fix spelling mistake "disasembler" -> "disassembler" selftests/bpf: Fix bpftool synctypes checking failure selftests/bpf: Panic on hard/soft lockup docs/bpf: Add documentation for new cgroup local storage selftests/bpf: Add test cgrp_local_storage to DENYLIST.s390x selftests/bpf: Add selftests for new cgroup local storage selftests/bpf: Fix test test_libbpf_str/bpf_map_type_str bpftool: Support new cgroup local storage libbpf: Support new cgroup local storage bpf: Implement cgroup storage available to non-cgroup-attached bpf progs bpf: Refactor some inode/task/sk storage functions for reuse bpf: Make struct cgroup btf id global selftests/bpf: Tracing prog can still do lookup under busy lock selftests/bpf: Ensure no task storage failure for bpf_lsm.s prog due to deadlock detection bpf: Add new bpf_task_storage_delete proto with no deadlock detection bpf: bpf_task_storage_delete_recur does lookup first before the deadlock check ... ==================== Link: https://lore.kernel.org/r/20221102062120.5724-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02blk-mq: add tagset quiesce interfaceChao Leng
Drivers that have shared tagsets may need to quiesce potentially a lot of request queues that all share a single tagset (e.g. nvme). Add an interface to quiesce all the queues on a given tagset. This interface is useful because it can speedup the quiesce by doing it in parallel. Because some queues should not need to be quiesced (e.g. the nvme connect_q) when quiescing the tagset, introduce a QUEUE_FLAG_SKIP_TAGSET_QUIESCE flag to allow this new interface to ski quiescing a particular queue. Signed-off-by: Chao Leng <lengchao@huawei.com> [hch: simplify for the per-tag_set srcu_struct] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Chao Leng <lengchao@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20221101150050.3510-14-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-02blk-mq: pass a tagset to blk_mq_wait_quiesce_doneChristoph Hellwig
Nothing in blk_mq_wait_quiesce_done needs the request_queue now, so just pass the tagset, and move the non-mq check into the only caller that needs it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chao Leng <lengchao@huawei.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20221101150050.3510-13-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-02blk-mq: move the srcu_struct used for quiescing to the tagsetChristoph Hellwig
All I/O submissions have fairly similar latencies, and a tagset-wide quiesce is a fairly common operation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Chao Leng <lengchao@huawei.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20221101150050.3510-12-hch@lst.de [axboe: fix whitespace] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-02memory: omap-gpmc: wait pin additionsBenedikt Niedermayr
This patch introduces support for setting the wait-pin polarity as well as using the same wait-pin for different CS regions. The waitpin polarity can be configured via the WAITPIN<X>POLARITY bits in the GPMC_CONFIG register. This is currently not supported by the driver. This patch adds support for setting the required register bits with the "ti,wait-pin-polarity" dt-property. The wait-pin can also be shared between different CS regions for special usecases. Therefore GPMC must keep track of wait-pin allocations, so it knows that either GPMC itself or another driver has the ownership. Signed-off-by: Benedikt Niedermayr <benedikt.niedermayr@siemens.com> Link: https://lore.kernel.org/r/20221102133047.1654449-2-benedikt.niedermayr@siemens.com Reviewed-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2022-11-02net: wwan: iosm: add rpc interface for xmm modemsShane Parslow
Add a new iosm wwan port that connects to the modem rpc interface. This interface provides a configuration channel, and in the case of the 7360, is the only way to configure the modem (as it does not support mbim). The new interface is compatible with existing software, such as open_xdatachannel.py from the xmm7360-pci project [1]. [1] https://github.com/xmm7360/xmm7360-pci Signed-off-by: Shane Parslow <shaneparslow808@gmail.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-02device property: Introduce fwnode_device_is_compatible() helperAndy Shevchenko
The fwnode_device_is_compatible() helper searches for the given string in the "compatible" string array property and, if found, returns true. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
2022-11-02cpufreq: Generalize of_perf_domain_get_sharing_cpumask phandle formatHector Martin
of_perf_domain_get_sharing_cpumask currently assumes a 1-argument phandle format, and directly returns the argument. Generalize this to return the full of_phandle_args, so it can be used by drivers which use other phandle styles (e.g. separate nodes). This also requires changing the CPU sharing match to compare the full args structure. Also, make sure to of_node_put(args.np) (the original code was leaking a reference). Signed-off-by: Hector Martin <marcan@marcan.st> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2022-11-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "x86: - fix lock initialization race in gfn-to-pfn cache (+selftests) - fix two refcounting errors - emulator fixes - mask off reserved bits in CPUID - fix bug with disabling SGX RISC-V: - update MAINTAINERS" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86/xen: Fix eventfd error handling in kvm_xen_eventfd_assign() KVM: x86: smm: number of GPRs in the SMRAM image depends on the image format KVM: x86: emulator: update the emulation mode after CR0 write KVM: x86: emulator: update the emulation mode after rsm KVM: x86: emulator: introduce emulator_recalc_and_set_mode KVM: x86: emulator: em_sysexit should update ctxt->mode KVM: selftests: Mark "guest_saw_irq" as volatile in xen_shinfo_test KVM: selftests: Add tests in xen_shinfo_test to detect lock races KVM: Reject attempts to consume or refresh inactive gfn_to_pfn_cache KVM: Initialize gfn_to_pfn_cache locks in dedicated helper KVM: VMX: fully disable SGX if SECONDARY_EXEC_ENCLS_EXITING unavailable KVM: x86: Exempt pending triple fault from event injection sanity check MAINTAINERS: git://github -> https://github.com for kvm-riscv KVM: debugfs: Return retval of simple_attr_open() if it fails KVM: x86: Reduce refcount if single_open() fails in kvm_mmu_rmaps_stat_open() KVM: x86: Mask off reserved bits in CPUID.8000001FH KVM: x86: Mask off reserved bits in CPUID.8000001AH KVM: x86: Mask off reserved bits in CPUID.80000008H KVM: x86: Mask off reserved bits in CPUID.80000006H KVM: x86: Mask off reserved bits in CPUID.80000001H
2022-11-01spi: introduce new helpers with using modern namingYang Yingliang
For using modern names host/target to instead of all the legacy names, I think it takes 3 steps: - step1: introduce new helpers with modern naming. - step2: switch to use these new helpers in all drivers. - step3: remove all legacy helpers and update all legacy names. This patch is for step1, it introduces new helpers with host/target naming for drivers using. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221011092204.950288-1-yangyingliang@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-10-31rtnetlink: pass netlink message header and portid to rtnl_configure_link()Hangbin Liu
This patch pass netlink message header and portid to rtnl_configure_link() All the functions in this call chain need to add the parameters so we can use them in the last call rtnl_notify(), and notify the userspace about the new link info if NLM_F_ECHO flag is set. - rtnl_configure_link() - __dev_notify_flags() - rtmsg_ifinfo() - rtmsg_ifinfo_event() - rtmsg_ifinfo_build_skb() - rtmsg_ifinfo_send() - rtnl_notify() Also move __dev_notify_flags() declaration to net/core/dev.h, as Jakub suggested. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-31cgroup: cgroup refcnt functions should be exported when CONFIG_DEBUG_CGROUP_REFTejun Heo
6ab428604f72 ("cgroup: Implement DEBUG_CGROUP_REF") added a config option which forces cgroup refcnt functions to be not inlined so that they can be kprobed for debugging. However, it forgot export them when the config is enabled breaking modules which make use of css reference counting. Fix it by adding CGROUP_REF_EXPORT() macro to cgroup_refcnt.h which is defined to EXPORT_SYMBOL_GPL when CONFIG_DEBUG_CGROUP_REF is set. Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: 6ab428604f72 ("cgroup: Implement DEBUG_CGROUP_REF")
2022-10-31fs: introduce dedicated idmap type for mountsChristian Brauner
Last cycle we've already made the interaction with idmapped mounts more robust and type safe by introducing the vfs{g,u}id_t type. This cycle we concluded the conversion and removed the legacy helpers. Currently we still pass around the plain namespace that was attached to a mount. This is in general pretty convenient but it makes it easy to conflate filesystem and mount namespaces and what different roles they have to play. Especially for filesystem developers without much experience in this area this is an easy source for bugs. Instead of passing the plain namespace we introduce a dedicated type struct mnt_idmap and replace the pointer with a pointer to a struct mnt_idmap. There are no semantic or size changes for the mount struct caused by this. We then start converting all places aware of idmapped mounts to rely on struct mnt_idmap. Once the conversion is done all helpers down to the really low-level make_vfs{g,u}id() and from_vfs{g,u}id() will take a struct mnt_idmap argument instead of two namespace arguments. This way it becomes impossible to conflate the two, removing and thus eliminating the possibility of any bugs. Fwiw, I fixed some issues in that area a while ago in ntfs3 and ksmbd in the past. Afterwards, only low-level code can ultimately use the associated namespace for any permission checks. Even most of the vfs can be ultimately completely oblivious about this and filesystems will never interact with it directly in any form in the future. A struct mnt_idmap currently encompasses a simple refcount and a pointer to the relevant namespace the mount is idmapped to. If a mount isn't idmapped then it will point to a static nop_mnt_idmap. If it is an idmapped mount it will point to a new struct mnt_idmap. As usual there are no allocations or anything happening for non-idmapped mounts. Everthing is carefully written to be a nop for non-idmapped mounts as has always been the case. If an idmapped mount or mount tree is created a new struct mnt_idmap is allocated and a reference taken on the relevant namespace. For each mount in a mount tree that gets idmapped or a mount that inherits the idmap when it is cloned the reference count on the associated struct mnt_idmap is bumped. Just a reminder that we only allow a mount to change it's idmapping a single time and only if it hasn't already been attached to the filesystems and has no active writers. The actual changes are fairly straightforward. This will have huge benefits for maintenance and security in the long run even if it causes some churn. I'm aware that there's some cost for all of you. And I'll commit to doing this work and make this as painless as I can. Note that this also makes it possible to extend struct mount_idmap in the future. For example, it would be possible to place the namespace pointer in an anonymous union together with an idmapping struct. This would allow us to expose an api to userspace that would let it specify idmappings directly instead of having to go through the detour of setting up namespaces at all. This just adds the infrastructure and doesn't do any conversions. Reviewed-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
2022-10-31block: simplify blksize_bits() implementationDawei Li
Convert current looping-based implementation into bit operation, which can bring improvement for: 1) bitops is more efficient for its arch-level optimization. 2) Given that blksize_bits() is inline, _if_ @size is compile-time constant, it's possible that order_base_2() _may_ make output compile-time evaluated, depending on code context and compiler behavior. Signed-off-by: Dawei Li <set_pte_at@outlook.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/TYCP286MB23238842958D7C083D6B67CECA349@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-10-31ptp: introduce helpers to adjust by scaled parts per millionJacob Keller
Many drivers implement the .adjfreq or .adjfine PTP op function with the same basic logic: 1. Determine a base frequency value 2. Multiply this by the abs() of the requested adjustment, then divide by the appropriate divisor (1 billion, or 65,536 billion). 3. Add or subtract this difference from the base frequency to calculate a new adjustment. A few drivers need the difference and direction rather than the combined new increment value. I recently converted the Intel drivers to .adjfine and the scaled parts per million (65.536 parts per billion) logic. To avoid overflow with minimal loss of precision, mul_u64_u64_div_u64 was used. The basic logic used by all of these drivers is very similar, and leads to a lot of duplicate code to perform the same task. Rather than keep this duplicate code, introduce diff_by_scaled_ppm and adjust_by_scaled_ppm. These helper functions calculate the difference or adjustment necessary based on the scaled parts per million input. The diff_by_scaled_ppm function returns true if the difference should be subtracted, and false otherwise. Update the Intel drivers to use the new helper functions. Other vendor drivers will be converted to .adjfine and this helper function in the following changes. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31ptp: add missing documentation for parametersJacob Keller
The ptp_find_pin_unlocked function and the ptp_system_timestamp structure didn't document their parameters and fields. Fix this. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-30net: remove unused netdev_unregistering()Juhee Kang
Currently, use dev->reg_state == NETREG_UNREGISTERING to check the status which is NETREG_UNREGISTERING, rather than using netdev_unregistering. Also, A helper function which is netdev_unregistering on nedevice.h is no longer used. Thus, netdev_unregistering removes from netdevice.h. Signed-off-by: Juhee Kang <claudiajkang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-30Merge tag 'fbdev-for-6.1-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev Pull fbdev fixes from Helge Deller: "A use-after-free bugfix in the smscufx driver and various minor error path fixes, smaller build fixes, sysfs fixes and typos in comments in the stifb, sisfb, da8xxfb, xilinxfb, sm501fb, gbefb and cyber2000fb drivers" * tag 'fbdev-for-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: fbdev: cyber2000fb: fix missing pci_disable_device() fbdev: sisfb: use explicitly signed char fbdev: smscufx: Fix several use-after-free bugs fbdev: xilinxfb: Make xilinxfb_release() return void fbdev: sisfb: fix repeated word in comment fbdev: gbefb: Convert sysfs snprintf to sysfs_emit fbdev: sm501fb: Convert sysfs snprintf to sysfs_emit fbdev: stifb: Fall back to cfb_fillrect() on 32-bit HCRX cards fbdev: da8xx-fb: Fix error handling in .remove() fbdev: MIPS supports iomem addresses
2022-10-30Merge tag 'char-misc-6.1-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc fixes from Greg KH: "Some small driver fixes for 6.1-rc3. They include: - iio driver bugfixes - counter driver bugfixes - coresight bugfixes, including a revert and then a second fix to get it right. All of these have been in linux-next with no reported problems" * tag 'char-misc-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits) misc: sgi-gru: use explicitly signed char coresight: cti: Fix hang in cti_disable_hw() Revert "coresight: cti: Fix hang in cti_disable_hw()" counter: 104-quad-8: Fix race getting function mode and direction counter: microchip-tcb-capture: Handle Signal1 read and Synapse coresight: cti: Fix hang in cti_disable_hw() coresight: Fix possible deadlock with lock dependency counter: ti-ecap-capture: fix IS_ERR() vs NULL check counter: Reduce DEFINE_COUNTER_ARRAY_POLARITY() to defining counter_array iio: bmc150-accel-core: Fix unsafe buffer attributes iio: adxl367: Fix unsafe buffer attributes iio: adxl372: Fix unsafe buffer attributes iio: at91-sama5d2_adc: Fix unsafe buffer attributes iio: temperature: ltc2983: allocate iio channels once tools: iio: iio_utils: fix digit calculation iio: adc: stm32-adc: fix channel sampling time init iio: adc: mcp3911: mask out device ID in debug prints iio: adc: mcp3911: use correct id bits iio: adc: mcp3911: return proper error code on failure to allocate trigger iio: adc: mcp3911: fix sizeof() vs ARRAY_SIZE() bug ...
2022-10-30sched/psi: Use task->psi_flags to clear in CPU migrationChengming Zhou
The commit d583d360a620 ("psi: Fix psi state corruption when schedule() races with cgroup move") fixed a race problem by making cgroup_move_task() use task->psi_flags instead of looking at the scheduler state. We can extend task->psi_flags usage to CPU migration, which should be a minor optimization for performance and code simplicity. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lore.kernel.org/r/20220926081931.45420-1-zhouchengming@bytedance.com
2022-10-30sched/psi: Stop relying on timer_pending() for poll_work reschedulingSuren Baghdasaryan
Psi polling mechanism is trying to minimize the number of wakeups to run psi_poll_work and is currently relying on timer_pending() to detect when this work is already scheduled. This provides a window of opportunity for psi_group_change to schedule an immediate psi_poll_work after poll_timer_fn got called but before psi_poll_work could reschedule itself. Below is the depiction of this entire window: poll_timer_fn wake_up_interruptible(&group->poll_wait); psi_poll_worker wait_event_interruptible(group->poll_wait, ...) psi_poll_work psi_schedule_poll_work if (timer_pending(&group->poll_timer)) return; ... mod_timer(&group->poll_timer, jiffies + delay); Prior to 461daba06bdc we used to rely on poll_scheduled atomic which was reset and set back inside psi_poll_work and therefore this race window was much smaller. The larger window causes increased number of wakeups and our partners report visible power regression of ~10mA after applying 461daba06bdc. Bring back the poll_scheduled atomic and make this race window even narrower by resetting poll_scheduled only when we reach polling expiration time. This does not completely eliminate the possibility of extra wakeups caused by a race with psi_group_change however it will limit it to the worst case scenario of one extra wakeup per every tracking window (0.5s in the worst case). This patch also ensures correct ordering between clearing poll_scheduled flag and obtaining changed_states using memory barrier. Correct ordering between updating changed_states and setting poll_scheduled is ensured by atomic_xchg operation. By tracing the number of immediate rescheduling attempts performed by psi_group_change and the number of these attempts being blocked due to psi monitor being already active, we can assess the effects of this change: Before the patch: Run#1 Run#2 Run#3 Immediate reschedules attempted: 684365 1385156 1261240 Immediate reschedules blocked: 682846 1381654 1258682 Immediate reschedules (delta): 1519 3502 2558 Immediate reschedules (% of attempted): 0.22% 0.25% 0.20% After the patch: Run#1 Run#2 Run#3 Immediate reschedules attempted: 882244 770298 426218 Immediate reschedules blocked: 881996 769796 426074 Immediate reschedules (delta): 248 502 144 Immediate reschedules (% of attempted): 0.03% 0.07% 0.03% The number of non-blocked immediate reschedules dropped from 0.22-0.25% to 0.03-0.07%. The drop is attributed to the decrease in the race window size and the fact that we allow this race only when psi monitors reach polling window expiration time. Fixes: 461daba06bdc ("psi: eliminate kthread_worker from psi trigger scheduling mechanism") Reported-by: Kathleen Chang <yt.chang@mediatek.com> Reported-by: Wenju Xu <wenju.xu@mediatek.com> Reported-by: Jonathan Chen <jonathan.jmchen@mediatek.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Tested-by: SH Chen <show-hong.chen@mediatek.com> Link: https://lore.kernel.org/r/20221028194541.813985-1-surenb@google.com