summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2024-12-16net/mlx5: Add device cap abs_native_port_numRongwei Liu
When the abs_native_port_num is set, the native_port_num reported by the device may not be continuous and bigger than the num_lag_ports. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20241212221329.961628-2-tariqt@nvidia.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-12-15selinux: add netlink nlmsg_type audit messageThiébaud Weksteen
Add a new audit message type to capture nlmsg-related information. This is similar to LSM_AUDIT_DATA_IOCTL_OP which was added for the other SELinux extended permission (ioctl). Adding a new type is preferred to adding to the existing lsm_network_audit structure which contains irrelevant information for the netlink sockets (i.e., dport, sport). Signed-off-by: Thiébaud Weksteen <tweek@google.com> [PM: change "nlnk-msgtype" to "nl-msgtype" as discussed] Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-12-15Merge tag 'sched_urgent_for_v6.13_rc3-p2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Prevent incorrect dequeueing of the deadline dlserver helper task and fix its time accounting - Properly track the CFS runqueue runnable stats - Check the total number of all queued tasks in a sched fair's runqueue hierarchy before deciding to stop the tick - Fix the scheduling of the task that got woken last (NEXT_BUDDY) by preventing those from being delayed * tag 'sched_urgent_for_v6.13_rc3-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/dlserver: Fix dlserver time accounting sched/dlserver: Fix dlserver double enqueue sched/eevdf: More PELT vs DELAYED_DEQUEUE sched/fair: Fix sched_can_stop_tick() for fair tasks sched/fair: Fix NEXT_BUDDY
2024-12-15netlink: add IGMP/MLD join/leave notificationsYuyang Huang
This change introduces netlink notifications for multicast address changes. The following features are included: * Addition and deletion of multicast addresses are reported using RTM_NEWMULTICAST and RTM_DELMULTICAST messages with AF_INET and AF_INET6. * Two new notification groups: RTNLGRP_IPV4_MCADDR and RTNLGRP_IPV6_MCADDR are introduced for receiving these events. This change allows user space applications (e.g., ip monitor) to efficiently track multicast group memberships by listening for netlink events. Previously, applications relied on inefficient polling of procfs, introducing delays. With netlink notifications, applications receive realtime updates on multicast group membership changes, enabling more precise metrics collection and system monitoring.  This change also unlocks the potential for implementing a wide range of sophisticated multicast related features in user space by allowing applications to combine kernel provided multicast address information with user space data and communicate decisions back to the kernel for more fine grained control. This mechanism can be used for various purposes, including multicast filtering, IGMP/MLD offload, and IGMP/MLD snooping. Cc: Maciej Żenczykowski <maze@google.com> Cc: Lorenzo Colitti <lorenzo@google.com> Co-developed-by: Patrick Ruddy <pruddy@vyatta.att-mail.com> Signed-off-by: Patrick Ruddy <pruddy@vyatta.att-mail.com> Link: https://lore.kernel.org/r/20180906091056.21109-1-pruddy@vyatta.att-mail.com Signed-off-by: Yuyang Huang <yuyanghuang@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-14power: supply: core: add UAPI to discover currently used extensionsThomas Weißschuh
Userspace wants to now about the used power supply extensions, for example to handle a device extended by a certain extension differently or to discover information about the extending device. Add a sysfs directory to the power supply device. This directory contains links which are named after the used extension and point to the device implementing that extension. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Armin Wolf <W_Armin@gmx.de> Link: https://lore.kernel.org/r/20241211-power-supply-extensions-v6-4-9d9dc3f3d387@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-12-14Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds
Pull bpf fixes from Daniel Borkmann: - Fix a bug in the BPF verifier to track changes to packet data property for global functions (Eduard Zingerman) - Fix a theoretical BPF prog_array use-after-free in RCU handling of __uprobe_perf_func (Jann Horn) - Fix BPF tracing to have an explicit list of tracepoints and their arguments which need to be annotated as PTR_MAYBE_NULL (Kumar Kartikeya Dwivedi) - Fix a logic bug in the bpf_remove_insns code where a potential error would have been wrongly propagated (Anton Protopopov) - Avoid deadlock scenarios caused by nested kprobe and fentry BPF programs (Priya Bala Govindasamy) - Fix a bug in BPF verifier which was missing a size check for BTF-based context access (Kumar Kartikeya Dwivedi) - Fix a crash found by syzbot through an invalid BPF prog_array access in perf_event_detach_bpf_prog (Jiri Olsa) - Fix several BPF sockmap bugs including a race causing a refcount imbalance upon element replace (Michal Luczaj) - Fix a use-after-free from mismatching BPF program/attachment RCU flavors (Jann Horn) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: (23 commits) bpf: Avoid deadlock caused by nested kprobe and fentry bpf programs selftests/bpf: Add tests for raw_tp NULL args bpf: Augment raw_tp arguments with PTR_MAYBE_NULL bpf: Revert "bpf: Mark raw_tp arguments with PTR_MAYBE_NULL" selftests/bpf: Add test for narrow ctx load for pointer args bpf: Check size for BTF-based ctx access of pointer members selftests/bpf: extend changes_pkt_data with cases w/o subprograms bpf: fix null dereference when computing changes_pkt_data of prog w/o subprogs bpf: Fix theoretical prog_array UAF in __uprobe_perf_func() bpf: fix potential error return selftests/bpf: validate that tail call invalidates packet pointers bpf: consider that tail calls invalidate packet pointers selftests/bpf: freplace tests for tracking of changes_packet_data bpf: check changes_pkt_data property for extension programs selftests/bpf: test for changing packet data from global functions bpf: track changes_pkt_data property for global functions bpf: refactor bpf_helper_changes_pkt_data to use helper number bpf: add find_containing_subprog() utility function bpf,perf: Fix invalid prog_array access in perf_event_detach_bpf_prog bpf: Fix UAF via mismatching bpf_prog/attachment RCU flavors ...
2024-12-14Merge branches 'fixes.2024.12.14a', 'rcutorture.2024.12.14a', ↵Uladzislau Rezki (Sony)
'srcu.2024.12.14a' and 'torture-test.2024.12.14a' into rcu-merge.2024.12.14a fixes.2024.12.14a: RCU fixes rcutorture.2024.12.14a: Torture-test updates srcu.2024.12.14a: SRCU updates torture-test.2024.12.14a: Adding an extra test, fixes
2024-12-14srcu: Fix typo s/srcu_check_read_flavor()/__srcu_check_read_flavor()/Paul E. McKenney
This commit fixes a typo in which a comment needed to have been updated from srcu_check_read_flavor() to __srcu_check_read_flavor(). Reported-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Closes: https://lore.kernel.org/all/b75d1fcd-6fcd-4619-bb5c-507fa599ee28@amd.com/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14srcu: Guarantee non-negative return value from srcu_read_lock()Paul E. McKenney
For almost 20 years, the int return value from srcu_read_lock() has been always either zero or one. This commit therefore documents the fact that it will be non-negative, and does the same for the underlying __srcu_read_lock(). [ paulmck: Apply Andrii Nakryiko feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14rcutorture: Use symbols for SRCU reader flavorsPaul E. McKenney
This commit converts rcutorture.c values for the reader_flavor module parameter from hexadecimal to the SRCU_READ_FLAVOR_* C-preprocessor macros. The actual modprobe or kernel-boot-parameter values for read_flavor must still be entered in hexadecimal. Link: https://lore.kernel.org/all/c48c9dca-fe07-4833-acaa-28c827e5a79e@amd.com/ Suggested-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14rcutorture: Check preemption for failing readerPaul E. McKenney
This commit checks to see if the RCU reader has been preempted within its read-side critical section for RCU flavors supporting this notion (currently only preemptible RCU). If such a preemption occurred, then this is printed at the end of the "Failure/close-call rcutorture reader segments" list at the end of the rcutorture run. [ paulmck: Apply kernel test robot feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Tested-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14hwmon: (pmbus/core) improve handling of write protected regulatorsJerome Brunet
Writing PMBus protected registers does succeed from the smbus perspective, even if the write is ignored by the device and a communication fault is raised. This fault will silently be caught and cleared by pmbus irq if one has been registered. This means that the regulator call may return succeed although the operation was ignored. With this change, the operation which are not supported will be properly flagged as such and the regulator framework won't even try to execute them. Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> [groeck: Adjust to EXPORT_SYMBOL_NS_GPL API change] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-12-14thermal: core: Add stub for thermal_zone_device_update()Thomas Weißschuh
To simplify the !CONFIG_THERMAL case in the hwmon core, add a !CONFIG_THERMAL stub for thermal_zone_device_update(). Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-12-14torture: Add dowarn argument to torture_sched_setaffinity()Paul E. McKenney
Current use cases of torture_sched_setaffinity() are well served by its unconditional warning on error. However, an upcoming use case for a preemption kthread needs to avoid warnings that might otherwise arise when that kthread attempted to bind itself to a CPU on its way offline. This commit therefore adds a dowarn argument that, when false, suppresses the warning. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-12-14exportfs: add open methodChristian Brauner
This allows filesystems such as pidfs to provide their custom open. Link: https://lore.kernel.org/r/20241129-work-pidfs-file_handle-v1-3-87d803a42495@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-12-14pseudofs: add support for export_opsErin Shepherd
Pseudo-filesystems might reasonably wish to implement the export ops (particularly for name_to_handle_at/open_by_handle_at); plumb this through pseudo_fs_context Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Erin Shepherd <erin.shepherd@e43.eu> Link: https://lore.kernel.org/r/20241113-pidfs_fh-v2-1-9a4d28155a37@e43.eu Link: https://lore.kernel.org/r/20241129-work-pidfs-file_handle-v1-1-87d803a42495@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-12-14pidfs: rework inode number allocationChristian Brauner
Recently we received a patchset that aims to enable file handle encoding and decoding via name_to_handle_at(2) and open_by_handle_at(2). A crucical step in the patch series is how to go from inode number to struct pid without leaking information into unprivileged contexts. The issue is that in order to find a struct pid the pid number in the initial pid namespace must be encoded into the file handle via name_to_handle_at(2). This can be used by containers using a separate pid namespace to learn what the pid number of a given process in the initial pid namespace is. While this is a weak information leak it could be used in various exploits and in general is an ugly wart in the design. To solve this problem a new way is needed to lookup a struct pid based on the inode number allocated for that struct pid. The other part is to remove the custom inode number allocation on 32bit systems that is also an ugly wart that should go away. So, a new scheme is used that I was discusssing with Tejun some time back. A cyclic ida is used for the lower 32 bits and a the high 32 bits are used for the generation number. This gives a 64 bit inode number that is unique on both 32 bit and 64 bit. The lower 32 bit number is recycled slowly and can be used to lookup struct pids. Link: https://lore.kernel.org/r/20241129-work-pidfs-v2-1-61043d66fbce@kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-12-14memory: omap-gpmc: deadcode a pair of functionsDr. David Alan Gilbert
gpmc_get_client_irq() last use was removed by commit ac28e47ccc3f ("ARM: OMAP2+: Remove legacy gpmc-nand.c") gpmc_ticks_to_ns() last use was removed by commit 2514830b8b8c ("ARM: OMAP2+: Remove gpmc-onenand") Remove them. gpmc_clk_ticks_to_ns() is now only used in some DEBUG code; move inside the ifdef to avoid unused warnings. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Roger Quadros <rogerq@kernel.org> Acked-by: Kevin Hilman <khilman@baylibre.com> Link: https://lore.kernel.org/r/20241211214227.107980-1-linux@treblig.org Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2024-12-14power: supply: core: implement extension APIThomas Weißschuh
Various drivers, mostly in platform/x86 extend the ACPI battery driver with additional sysfs attributes to implement more UAPIs than are exposed through ACPI by using various side-channels, like WMI, nonstandard ACPI or EC communication. While the created sysfs attributes look similar to the attributes provided by the powersupply core, there are various deficiencies: * They don't show up in uevent payload. * They can't be queried with the standard in-kernel APIs. * They don't work with triggers. * The extending driver has to reimplement all of the parsing, formatting and sysfs display logic. * Writing a extension driver is completely different from writing a normal power supply driver. This extension API avoids all of these issues. An extension is just a "struct power_supply_ext" with the same kind of callbacks as in a normal "struct power_supply_desc". The API is meant to be used via battery_hook_register(), the same way as the current extensions. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Armin Wolf <W_Armin@gmx.de> Link: https://lore.kernel.org/r/20241211-power-supply-extensions-v6-1-9d9dc3f3d387@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-12-13bpf: Revert "bpf: Mark raw_tp arguments with PTR_MAYBE_NULL"Kumar Kartikeya Dwivedi
This patch reverts commit cb4158ce8ec8 ("bpf: Mark raw_tp arguments with PTR_MAYBE_NULL"). The patch was well-intended and meant to be as a stop-gap fixing branch prediction when the pointer may actually be NULL at runtime. Eventually, it was supposed to be replaced by an automated script or compiler pass detecting possibly NULL arguments and marking them accordingly. However, it caused two main issues observed for production programs and failed to preserve backwards compatibility. First, programs relied on the verifier not exploring == NULL branch when pointer is not NULL, thus they started failing with a 'dereference of scalar' error. Next, allowing raw_tp arguments to be modified surfaced the warning in the verifier that warns against reg->off when PTR_MAYBE_NULL is set. More information, context, and discusson on both problems is available in [0]. Overall, this approach had several shortcomings, and the fixes would further complicate the verifier's logic, and the entire masking scheme would have to be removed eventually anyway. Hence, revert the patch in preparation of a better fix avoiding these issues to replace this commit. [0]: https://lore.kernel.org/bpf/20241206161053.809580-1-memxor@gmail.com Reported-by: Manu Bretelle <chantra@meta.com> Fixes: cb4158ce8ec8 ("bpf: Mark raw_tp arguments with PTR_MAYBE_NULL") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241213221929.3495062-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-13Merge tag 'block-6.13-20241213' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: - Series from Damien fixing issues with the zoned write plugging - Fix for a potential UAF in block cgroups - Fix deadlock around queue freezing and the sysfs lock - Various little cleanups and fixes * tag 'block-6.13-20241213' of git://git.kernel.dk/linux: block: Fix potential deadlock while freezing queue and acquiring sysfs_lock block: Fix queue_iostats_passthrough_show() blk-mq: Clean up blk_mq_requeue_work() mq-deadline: Remove a local variable blk-iocost: Avoid using clamp() on inuse in __propagate_weights() block: Make bio_iov_bvec_set() accept pointer to const iov_iter block: get wp_offset by bdev_offset_from_zone_start blk-cgroup: Fix UAF in blkcg_unpin_online() MAINTAINERS: update Coly Li's email address block: Prevent potential deadlocks in zone write plug error recovery dm: Fix dm-zoned-reclaim zone write pointer alignment block: Ignore REQ_NOWAIT for zone reset and zone finish operations block: Use a zone write plug BIO work for REQ_NOWAIT BIOs
2024-12-13bpf: Add a __btf_get_by_fd helperAnton Protopopov
Add a new helper to get a pointer to a struct btf from a file descriptor. This helper doesn't increase a refcnt. Add a comment explaining this and pointing to a corresponding function which does take a reference. Signed-off-by: Anton Protopopov <aspsk@isovalent.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241213130934.1087929-2-aspsk@isovalent.com
2024-12-13soc: mediatek: cmdq: Remove cmdq_pkt_finalize() helper functionChun-Kuang Hu
In order to have fine-grained control, use cmdq_pkt_eoc() and cmdq_pkt_jump_rel() to replace cmdq_pkt_finalize(). Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Acked-by: Matthias Brugger <matthias.bgg@gmail.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Sebastian Fricke <sebastian.fricke@collabora.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
2024-12-13kref: Improve documentationMatthew Wilcox (Oracle)
There is already kernel-doc written for many of the functions in kref.h but it's not linked into the html docs anywhere. Add it to kref.rst. Improve the kref documentation by using the standard Return: section, rewording some unclear verbiage and adding docs for some undocumented functions. Update Thomas' email address to his current one. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20241209160953.757673-1-willy@infradead.org
2024-12-13Merge tag 'ffa-fix-6.13' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes Arm FF-A fix for v6.13 A single fix to address a possible race around setting ffa_dev->properties in ffa_device_register() by updating ffa_device_register() to take all the partition information received from the firmware and updating the struct ffa_device accordingly before registering the device to the bus/driver model in the kernel. * tag 'ffa-fix-6.13' of https://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: firmware: arm_ffa: Fix the race around setting ffa_dev->properties Link: https://lore.kernel.org/r/20241210101113.3232602-1-sudeep.holla@arm.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-12-13firmware: cs_dsp: Add mock bin file generator for KUnit testingRichard Fitzgerald
Add a mock firmware file that emulates what the firmware build tools would normally create. This will be used by KUnit tests to generate a test bin file. The data payload in a bin is an opaque blob, so the mock bin only needs to generate the appropriate file header and description block for each payload blob. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20241212143725.1381013-5-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-13firmware: cs_dsp: Add mock wmfw file generator for KUnit testingRichard Fitzgerald
Add a mock firmware file that emulates what the firmware build tools would normally create. This will be used by KUnit tests to generate a test wmfw file. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20241212143725.1381013-4-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-13firmware: cs_dsp: Add mock DSP memory map for KUnit testingRichard Fitzgerald
Add helper functions to implement an emulation of the DSP memory map. There are three main groups of functionality: 1. Define a mock cs_dsp_region table. 2. Calculate the addresses of memory and algorithms from the firmware header in XM. 3. Build a mock XM header in emulated XM. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20241212143725.1381013-3-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-13firmware: cs_dsp: Add mock regmap for KUnit testingRichard Fitzgerald
Add a mock regmap implementation to act as a simulated DSP for KUnit testing. This is built as a utility module so that it could be used by clients of cs_dsp to create a mock "DSP" for their own testing. cs_dsp interacts with the DSP only through registers. Most of the register space of the DSP is RAM. ADSP cores have a small set of control registers. HALO Core DSPs have a much larger set of control registers but only a small subset are used. Most writes are "blind" in the sense that cs_dsp does not expect to receive any sort of response from the DSP. So there isn't any need to emulate a "DSP", only a set of registers that can be written and read back. The idea of the mock regmap is to use the cache to accumulate writes which can then be tested against the values that are expected to be in the registers. Stray writes can be detected by dropping the cache entries for all addresses that should have been written and then issuing a regcache_sync(). If this causes bus writes it means there were writes to unexpected registers. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20241212143725.1381013-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-13sched/dlserver: Fix dlserver double enqueueVineeth Pillai (Google)
dlserver can get dequeued during a dlserver pick_task due to the delayed deueue feature and this can lead to issues with dlserver logic as it still thinks that dlserver is on the runqueue. The dlserver throttling and replenish logic gets confused and can lead to double enqueue of dlserver. Double enqueue of dlserver could happend due to couple of reasons: Case 1 ------ Delayed dequeue feature[1] can cause dlserver being stopped during a pick initiated by dlserver: __pick_next_task pick_task_dl -> server_pick_task pick_task_fair pick_next_entity (if (sched_delayed)) dequeue_entities dl_server_stop server_pick_task goes ahead with update_curr_dl_se without knowing that dlserver is dequeued and this confuses the logic and may lead to unintended enqueue while the server is stopped. Case 2 ------ A race condition between a task dequeue on one cpu and same task's enqueue on this cpu by a remote cpu while the lock is released causing dlserver double enqueue. One cpu would be in the schedule() and releasing RQ-lock: current->state = TASK_INTERRUPTIBLE(); schedule(); deactivate_task() dl_stop_server(); pick_next_task() pick_next_task_fair() sched_balance_newidle() rq_unlock(this_rq) at which point another CPU can take our RQ-lock and do: try_to_wake_up() ttwu_queue() rq_lock() ... activate_task() dl_server_start() --> first enqueue wakeup_preempt() := check_preempt_wakeup_fair() update_curr() update_curr_task() if (current->dl_server) dl_server_update() enqueue_dl_entity() --> second enqueue This bug was not apparent as the enqueue in dl_server_start doesn't usually happen because of the defer logic. But as a side effect of the first case(dequeue during dlserver pick), dl_throttled and dl_yield will be set and this causes the time accounting of dlserver to messup and then leading to a enqueue in dl_server_start. Have an explicit flag representing the status of dlserver to avoid the confusion. This is set in dl_server_start and reset in dlserver_stop. Fixes: 63ba8422f876 ("sched/deadline: Introduce deadline servers") Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: "Vineeth Pillai (Google)" <vineeth@bitbyteword.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Marcel Ziswiler <marcel.ziswiler@codethink.co.uk> # ROCK 5B Link: https://lkml.kernel.org/r/20241213032244.877029-1-vineeth@bitbyteword.org
2024-12-13coresight: Add support for trace filtering by sourceTao Zhang
Some replicators have hard coded filtering of "trace" data, based on the source device. This is different from the trace filtering based on TraceID, available in the standard programmable replicators. e.g., Qualcomm replicators have filtering based on custom trace protocol format and is not programmable. The source device could be connected to the replicator via intermediate components (e.g., a funnel). Thus we need platform information from the firmware tables to decide the source device corresponding to a given output port from the replicator. Given this affects "trace path building" and traversing the path back from the sink to source, add the concept of "filtering by source" to the generic coresight connection. The specified source will be marked like below in the Devicetree. test-replicator { ... ... ... ... out-ports { ... ... ... ... port@0 { reg = <0>; xyz: endpoint { remote-endpoint = <&zyx>; filter-source = <&source_1>; <-- To specify the source to }; be filtered out here. }; port@1 { reg = <1>; abc: endpoint { remote-endpoint = <&cba>; filter-source = <&source_2>; <-- To specify the source to }; be filtered out here. }; }; }; Signed-off-by: Tao Zhang <quic_taozha@quicinc.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20241213100731.25914-4-quic_taozha@quicinc.com
2024-12-13coresight: Add a helper to check if a device is sourceTao Zhang
Since there are a lot of places in the code to check whether the device is source, add a helper to check it. Signed-off-by: Tao Zhang <quic_taozha@quicinc.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20241213100731.25914-3-quic_taozha@quicinc.com
2024-12-13x86/static-call: provide a way to do very early static-call updatesJuergen Gross
Add static_call_update_early() for updating static-call targets in very early boot. This will be needed for support of Xen guest type specific hypercall functions. This is part of XSA-466 / CVE-2024-53241. Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Co-developed-by: Peter Zijlstra <peterz@infradead.org> Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com>
2024-12-12skbuff: allow 2-4-argument skb_frag_dma_map()Alexander Lobakin
skb_frag_dma_map(dev, frag, 0, skb_frag_size(frag), DMA_TO_DEVICE) is repeated across dozens of drivers and really wants a shorthand. Add a macro which will count args and handle all possible number from 2 to 5. Semantics: skb_frag_dma_map(dev, frag) -> __skb_frag_dma_map(dev, frag, 0, skb_frag_size(frag), DMA_TO_DEVICE) skb_frag_dma_map(dev, frag, offset) -> __skb_frag_dma_map(dev, frag, offset, skb_frag_size(frag) - offset, DMA_TO_DEVICE) skb_frag_dma_map(dev, frag, offset, size) -> __skb_frag_dma_map(dev, frag, offset, size, DMA_TO_DEVICE) skb_frag_dma_map(dev, frag, offset, size, dir) -> __skb_frag_dma_map(dev, frag, offset, size, dir) No object code size changes for the existing callers. Users passing less arguments also won't have bigger size comparing to the full equivalent call. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20241211172649.761483-11-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-13power: supply: core: Add new "charge_types" propertyHans de Goede
Add a new "charge_types" property, this is identical to "charge_type" but reading returns a list of supported charge-types with the currently active type surrounded by square brackets, e.g.: Fast [Standard] "Long_Life" This has the advantage over the existing "charge_type" property that this allows userspace to find out which charge-types are supported for writable charge_type properties. Drivers which already support "charge_type" can easily add support for this by setting power_supply_desc.charge_types to a bitmask representing valid charge_type values. The existing "charge_type" get_property() and set_property() code paths can be re-used for "charge_types". Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241211174451.355421-3-hdegoede@redhat.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-12-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.13-rc3). No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-12Merge tag 'net-6.13-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth, netfilter and wireless. Current release - fix to a fix: - rtnetlink: fix error code in rtnl_newlink() - tipc: fix NULL deref in cleanup_bearer() Current release - regressions: - ip: fix warning about invalid return from in ip_route_input_rcu() Current release - new code bugs: - udp: fix L4 hash after reconnect - eth: lan969x: fix cyclic dependency between modules - eth: bnxt_en: fix potential crash when dumping FW log coredump Previous releases - regressions: - wifi: mac80211: - fix a queue stall in certain cases of channel switch - wake the queues in case of failure in resume - splice: do not checksum AF_UNIX sockets - virtio_net: fix BUG()s in BQL support due to incorrect accounting of purged packets during interface stop - eth: - stmmac: fix TSO DMA API mis-usage causing oops - bnxt_en: fixes for HW GRO: GSO type on 5750X chips and oops due to incorrect aggregation ID mask on 5760X chips Previous releases - always broken: - Bluetooth: improve setsockopt() handling of malformed user input - eth: ocelot: fix PTP timestamping in presence of packet loss - ptp: kvm: x86: avoid "fail to initialize ptp_kvm" when simply not supported" * tag 'net-6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits) net: dsa: tag_ocelot_8021q: fix broken reception net: dsa: microchip: KSZ9896 register regmap alignment to 32 bit boundaries net: renesas: rswitch: fix initial MPIC register setting Bluetooth: btmtk: avoid UAF in btmtk_process_coredump Bluetooth: iso: Fix circular lock in iso_conn_big_sync Bluetooth: iso: Fix circular lock in iso_listen_bis Bluetooth: SCO: Add support for 16 bits transparent voice setting Bluetooth: iso: Fix recursive locking warning Bluetooth: iso: Always release hdev at the end of iso_listen_bis Bluetooth: hci_event: Fix using rcu_read_(un)lock while iterating Bluetooth: hci_core: Fix sleeping function called from invalid context team: Fix feature propagation of NETIF_F_GSO_ENCAP_ALL team: Fix initial vlan_feature set in __team_compute_features bonding: Fix feature propagation of NETIF_F_GSO_ENCAP_ALL bonding: Fix initial {vlan,mpls}_feature set in bond_compute_features net, team, bonding: Add netdev_base_features helper net/sched: netem: account for backlog updates from child qdisc net: dsa: felix: fix stuck CPU-injected packets with short taprio windows splice: do not checksum AF_UNIX sockets net: usb: qmi_wwan: add Telit FE910C04 compositions ...
2024-12-12PCI: endpoint: Replace magic number '6' by PCI_STD_NUM_BARSRick Wertenbroek
Replace the constant "6" by PCI_STD_NUM_BARS, as defined in include/uapi/linux/pci_regs.h: #define PCI_STD_NUM_BARS 6 /* Number of standard BARs */ Link: https://lore.kernel.org/r/20241212162547.225880-1-rick.wertenbroek@gmail.com Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-12-12turris-omnia-mcu-interface.h: Add LED commands related definitions to global ↵Marek Behún
header Add definitions for contents of the OMNIA_CMD_LED_MODE and OMNIA_CMD_LED_STATE commands to the global turris-omnia-mcu-interface.h header. Signed-off-by: Marek Behún <kabel@kernel.org> Link: https://lore.kernel.org/r/20241111100355.6978-4-kabel@kernel.org Signed-off-by: Lee Jones <lee@kernel.org>
2024-12-12turris-omnia-mcu-interface.h: Move command execution function to global headerMarek Behún
Move the command execution functions from the turris-omnia-mcu platform driver private header to the global turris-omnia-mcu-interface.h header, so that they can be used by the LED driver. Signed-off-by: Marek Behún <kabel@kernel.org> Link: https://lore.kernel.org/r/20241111100355.6978-2-kabel@kernel.org Signed-off-by: Lee Jones <lee@kernel.org>
2024-12-12block: Make bio_iov_bvec_set() accept pointer to const iov_iterJohn Garry
Make bio_iov_bvec_set() accept a pointer to const iov_iter, which means that we can drop the undesirable casting to struct iov_iter pointer in blk_rq_map_user_bvec(). Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20241202115727.2320401-1-john.g.garry@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2024-12-12Merge branch 'platform-drivers-x86-platform-profile' into for-nextIlpo Järvinen
2024-12-12net, team, bonding: Add netdev_base_features helperDaniel Borkmann
Both bonding and team driver have logic to derive the base feature flags before iterating over their slave devices to refine the set via netdev_increment_features(). Add a small helper netdev_base_features() so this can be reused instead of having it open-coded multiple times. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Ido Schimmel <idosch@idosch.org> Cc: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20241210141245.327886-1-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-12-11lib: packing: add pack_fields() and unpack_fields()Vladimir Oltean
This is new API which caters to the following requirements: - Pack or unpack a large number of fields to/from a buffer with a small code footprint. The current alternative is to open-code a large number of calls to pack() and unpack(), or to use packing() to reduce that number to half. But packing() is not const-correct. - Use unpacked numbers stored in variables smaller than u64. This reduces the rodata footprint of the stored field arrays. - Perform error checking at compile time, rather than runtime, and return void from the API functions. Because the C preprocessor can't generate variable length code (loops), this is a bit tricky to do with macros. To handle this, implement macros which sanity check the packed field definitions based on their size. Finally, a single macro with a chain of __builtin_choose_expr() is used to select the appropriate macros. We enforce the use of ascending or descending order to avoid O(N^2) scaling when checking for overlap. Note that the macros are written with care to ensure that the compilers can correctly evaluate the resulting code at compile time. In particular, care was taken with avoiding too many nested statement expressions. Nested statement expressions trip up some compilers, especially when passing down variables created in previous statement expressions. There are two key design choices intended to keep the overall macro code size small. First, the definition of each CHECK_PACKED_FIELDS_N macro is implemented recursively, by calling the N-1 macro. This avoids needing the code to repeat multiple times. Second, the CHECK_PACKED_FIELD macro enforces that the fields in the array are sorted in order. This allows checking for overlap only with neighboring fields, rather than the general overlap case where each field would need to be checked against other fields. The overlap checks use the first two fields to determine the order of the remaining fields, thus allowing either ascending or descending order. This enables drivers the flexibility to keep the fields ordered in which ever order most naturally fits their hardware design and its associated documentation. The CHECK_PACKED_FIELDS macro is directly called from within pack_fields and unpack_fields, ensuring that all drivers using the API receive the benefits of the compile-time checks. Users do not need to directly call any of the macros directly. The CHECK_PACKED_FIELDS and its helper macros CHECK_PACKED_FIELDS_(0..50) are generated using a simple C program in scripts/gen_packed_field_checks.c This program can be compiled on demand and executed to generate the macro code in include/linux/packing.h. This will aid in the event that a driver needs more than 50 fields. The generator can be updated with a new size, and used to update the packing.h header file. In practice, the ice driver will need to support 27 fields, and the sja1105 driver will need to support 0 fields. This on-demand generation avoids the need to modify Kbuild. We do not anticipate the maximum number of fields to grow very often. - Reduced rodata footprint for the storage of the packed field arrays. To that end, we have struct packed_field_u8 and packed_field_u16, which define the fields with the associated type. More can be added as needed (unlikely for now). On these types, the same generic pack_fields() and unpack_fields() API can be used, thanks to the new C11 _Generic() selection feature, which can call pack_fields_u8() or pack_fields_16(), depending on the type of the "fields" array - a simplistic form of polymorphism. It is evaluated at compile time which function will actually be called. Over time, packing() is expected to be completely replaced either with pack() or with pack_fields(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Co-developed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241210-packing-pack-fields-and-ice-implementation-v10-3-ee56a47479ac@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-11kexec: Consolidate machine_kexec_mask_interrupts() implementationEliav Farber
Consolidate the machine_kexec_mask_interrupts implementation into a common function located in a new file: kernel/irq/kexec.c. This removes duplicate implementations from architecture-specific files in arch/arm, arch/arm64, arch/powerpc, and arch/riscv, reducing code duplication and improving maintainability. The new implementation retains architecture-specific behavior for CONFIG_GENERIC_IRQ_KEXEC_CLEAR_VM_FORWARD, which was previously implemented for ARM64. When enabled (currently for ARM64), it clears the active state of interrupts forwarded to virtual machines (VMs) before handling other interrupt masking operations. Signed-off-by: Eliav Farber <farbere@amazon.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20241204142003.32859-2-farbere@amazon.com
2024-12-11iio: consumers: ensure read buffers for labels and ext_info are page alignedMatteo Martelli
Attributes of iio providers are exposed via sysfs. Typically, providers pass attribute values to the iio core, which handles formatting and printing to sysfs. However, some attributes, such as labels or extended info, are directly formatted and printed to sysfs by provider drivers using sysfs_emit() and sysfs_emit_at(). These helpers assume the read buffer, allocated by sysfs fop, is page-aligned. When these attributes are accessed by consumer drivers, the read buffer is allocated by the consumer and may not be page-aligned, leading to failures in the provider's callback that utilizes sysfs_emit*. Add a check to ensure that read buffers for labels and external info attributes are page-aligned. Update the prototype documentation as well. Signed-off-by: Matteo Martelli <matteomartelli3@gmail.com> Link: https://patch.msgid.link/20241202-iio-kmalloc-align-v1-1-aa9568c03937@gmail.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-12-11iio: adc: ad_sigma_delta: Store information about reset sequence lengthUwe Kleine-König
The various chips can be reset using a sequence of SPI transfers with MOSI = 1. The length of such a sequence varies from chip to chip. Store that length in struct ad_sigma_delta_info and replace the respective parameter to ad_sd_reset() with it. Note the ad7192 used to pass 48 as length but the documentation specifies 40 as the required length. Assuming the latter is right. (Using a too long sequence doesn't hurt apart from using a longer spi transfer than necessary, so this is no relevant fix.) The motivation for storing this information is that this is useful to clear a pending R̅D̅Y̅ signal in the next change. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://patch.msgid.link/9750db62fce638bf140ff48172c23bff7f785e5b.1733504533.git.u.kleine-koenig@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-12-11iio: adc: ad_sigma_delta: Fix a race conditionUwe Kleine-König
The ad_sigma_delta driver helper uses irq_disable_nosync(). With that one it is possible that the irq handler still runs after the irq_disable_nosync() function call returns. Also to properly synchronize irq disabling in the different threads proper locking is needed and because it's unclear if the irq handler's irq_disable_nosync() call comes first or the one in the enabler's error path, all code locations that disable the irq must check for .irq_dis first to ensure there is exactly one disable call per enable call. So add a spinlock to the struct ad_sigma_delta and use it to synchronize irq enabling and disabling. Also only act in the irq handler if the irq is still enabled. Fixes: af3008485ea0 ("iio:adc: Add common code for ADI Sigma Delta devices") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://patch.msgid.link/9e6def47e2e773e0e15b7a2c29d22629b53d91b1.1733504533.git.u.kleine-koenig@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-12-11iio: adc: ad_sigma_delta: Add support for reading irq status using a GPIOUwe Kleine-König
Some of the ADCs by Analog signal their irq condition on the MISO line. So typically that line is connected to an SPI controller and a GPIO. The GPIO is used as input and the respective interrupt is enabled when the last SPI transfer is completed. Depending on the GPIO controller the toggling MISO line might make the interrupt pending even while it's masked. In that case the irq handler is called immediately after irq_enable() and so before the device actually pulls that line low which results in non-sense values being reported to the upper layers. The only way to find out if the line was actually pulled low is to read the GPIO. (There is a flag in AD7124's status register that also signals if an interrupt was asserted, but reading that register toggles the MISO line and so might trigger another spurious interrupt.) Add the possibility to specify an interrupt GPIO in the machine description in addition to the plain interrupt. This GPIO is used then to check if the irq line is actually active in the irq handler. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://patch.msgid.link/5be9a4cc4dc600ec384c88db01dd661a21506b9c.1733504533.git.u.kleine-koenig@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-12-11fs: don't block write during exec on pre-content watched filesAmir Goldstein
Commit 2a010c412853 ("fs: don't block i_writecount during exec") removed the legacy behavior of getting ETXTBSY on attempt to open and executable file for write while it is being executed. This commit was reverted because an application that depends on this legacy behavior was broken by the change. We need to allow HSM writing into executable files while executed to fill their content on-the-fly. To that end, disable the ETXTBSY legacy behavior for files that are watched by pre-content events. This change is not expected to cause regressions with existing systems which do not have any pre-content event listeners. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Acked-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20241128142532.465176-1-amir73il@gmail.com