summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2025-02-12scsi: ufs: core: Critical health conditionAvri Altman
Martin hi, The UFS4.1 standard, released on January 8, 2025, added a new exception event: HEALTH_CRITICAL, which notifies the host of a device's critical health condition. This notification implies that the device is approaching the end of its lifetime based on the amount of performed program/erase cycles. Once an EOL (End-of-Life) exception event is received, we increment a designated member, which is exposed via a sysfs entry. This new entry, will report the number of times a critical health event has been reported by a UFS device. To handle this new sysfs entry, userspace applications can use select(), poll(), or epoll() to monitor changes in the critical_health attribute. The kernel will call sysfs_notify() to signal changes, allowing the userspace application to detect and respond to these changes efficiently. The host can gain further insight into the specific issue by reading one of the following attributes: bPreEOLInfo, bDeviceLifeTimeEstA, bDeviceLifeTimeEstB, bWriteBoosterBufferLifeTimeEst, and bRPMBLifeTimeEst. All those are available for reading via the driver's sysfs entries or through an applicable utility. It is up to userspace to read these attributes if needed. Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20250211065813.58091-1-avri.altman@wdc.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-12scsi: iscsi_tcp: Switch to using the crc32c libraryEric Biggers
Now that the crc32c() library function directly takes advantage of architecture-specific optimizations, it is unnecessary to go through the crypto API. Just use crc32c(). This is much simpler, and it improves performance due to eliminating the crypto API overhead. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20250207041724.70733-1-ebiggers@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-12net: phylink: provide phylink_mac_implements_lpi()Russell King (Oracle)
Provide a helper to determine whether the MAC operations structure implements the LPI operations, which will be used by both phylink and DSA. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1thR9g-003vX6-4s@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-12drm/amdgpu: Add flags to distinguish vf/pf/pt modeAsad Kamal
Add extra flag definition for ids_flag field to distinguish between vf/pf/pt modes v2: Updated kms driver minor version & removed pf check as default is 0 v3: Fix up version (Alex) v4: rebase (Alex) Proposed userspace: https://github.com/ROCm/amdsmi/commit/e663bed7d6b3df79f5959e73981749b1f22ec698 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12net: report csum_complete via qstatsJakub Kicinski
Commit 13c7c941e729 ("netdev: add qstat for csum complete") reserved the entry for csum complete in the qstats uAPI. Start reporting this value now that we have a driver which needs it. Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20250211181356.580800-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13drm/i2c: tda998x: drop support for platform_dataDmitry Baryshkov
After the commit 0fb2970b4b6b ("drm/armada: remove non-component support") there are no remaining users of the struct tda998x_encoder_params. Drop the header and corresponding API from the TDA998x driver. Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250113-drm-move-tda998x-v3-1-214e0682a5e4@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2025-02-12net: phy: Add support for driver-specific next update timeOleksij Rempel
Introduce the `phy_get_next_update_time` function to allow PHY drivers to dynamically determine the time (in jiffies) until the next state update event. This enables more flexible and adaptive polling intervals based on the link state or other conditions. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250210082358.200751-2-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-12spi: fix missing offload_flags docDavid Lechner
Add offload_flags to the documentation comment for struct spi_transfer. This was missed when adding the field. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/linux-next/20250212154356.784944ea@canb.auug.org.au/ Fixes: 700a281905f2 ("spi: add offload TX/RX streaming APIs") Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20250212-spi-offload-fixes-v1-1-e192c69e3bb3@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-02-12gpiolib: Simplify implementation of for_each_hwgpio_in_range()Andy Shevchenko
The whole purpose of the custom CLASS() is to have possibility to initialise the counter variable _i to 0. This can't be done with simple __free() macro as it will be not allowed by C language. OTOH, the CLASS() operates with the pointers and explicit usage of the scoped variable _data is not needed, since the pointers are kept the same over the iterations. Simplify the implementation of for_each_hwgpio_in_range(). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20250207151149.2119765-3-andriy.shevchenko@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-02-12gpiolib: Deduplicate some code in for_each_requested_gpio_in_range()Andy Shevchenko
Refactor for_each_requested_gpio_in_range() to deduplicate some code which is basically repeats the for_each_hwgpio(). In order to achieve this, split the latter to two, for_each_hwgpio_in_range() and for_each_hwgpio(). Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20250207151149.2119765-2-andriy.shevchenko@linux.intel.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-02-12Merge patch series "fs: allow changing idmappings"Christian Brauner
Christian Brauner <brauner@kernel.org> says: Currently, it isn't possible to change the idmapping of an idmapped mount. This is becoming an obstacle for various use-cases. /* idmapped home directories with systemd-homed */ On newer systems /home is can be an idmapped mount such that each file on disk is owned by 65536 and a subfolder exists for foreign id ranges such as containers. For example, a home directory might look like this (using an arbitrary folder as an example): user1@localhost:~/data/mount-idmapped$ ls -al /data/ total 16 drwxrwxrwx 1 65536 65536 36 Jan 27 12:15 . drwxrwxr-x 1 root root 184 Jan 27 12:06 .. -rw-r--r-- 1 65536 65536 0 Jan 27 12:07 aaa -rw-r--r-- 1 65536 65536 0 Jan 27 12:07 bbb -rw-r--r-- 1 65536 65536 0 Jan 27 12:07 cc drwxr-xr-x 1 2147352576 2147352576 0 Jan 27 19:06 containers When logging in home is mounted as an idmapped mount with the following idmappings: 65536:$(id -u):1 // uid mapping 65536:$(id -g):1 // gid mapping 2147352576:2147352576:65536 // uid mapping 2147352576:2147352576:65536 // gid mapping So for a user with uid/gid 1000 an idmapped /home would like like this: user1@localhost:~/data/mount-idmapped$ ls -aln /mnt/ total 16 drwxrwxrwx 1 1000 1000 36 Jan 27 12:15 . drwxrwxr-x 1 0 0 184 Jan 27 12:06 .. -rw-r--r-- 1 1000 1000 0 Jan 27 12:07 aaa -rw-r--r-- 1 1000 1000 0 Jan 27 12:07 bbb -rw-r--r-- 1 1000 1000 0 Jan 27 12:07 cc drwxr-xr-x 1 2147352576 2147352576 0 Jan 27 19:06 containers In other words, 65536 is mapped to the user's uid/gid and the range 2147352576 up to 2147352576 + 65536 is an identity mapping for containers. When a container is started a transient uid/gid range is allocated outside of both mappings of the idmapped mount. For example, the container might get the idmapping: $ cat /proc/1742611/uid_map 0 537985024 65536 This container will be allowed to write to disk within the allocated foreign id range 2147352576 to 2147352576 + 65536. To do this an idmapped mount must be created from an already idmapped mount such that: - The mappings for the user's uid/gid must be dropped, i.e., the following mappings are removed: 65536:$(id -u):1 // uid mapping 65536:$(id -g):1 // gid mapping - A mapping for the transient uid/gid range to the foreign uid/gid range is added: 2147352576:537985024:65536 In combination this will mean that the container will write to disk within the foreign id range 2147352576 to 2147352576 + 65536. /* nested containers */ When the outer container makes use of idmapped mounts it isn't posssible to create an idmapped mount for the inner container with a differen idmapping from the outer container's idmapped mount. There are other usecases and the two above just serve as an illustration of the problem. This patchset makes it possible to create a new idmapped mount from an already idmapped mount. It aims to adhere to current performance constraints and requirements: - Idmapped mounts aim to have near zero performance implications for path lookup. That is why no refernce counting, locking or any other mechanism can be required that would impact performance. This works be ensuring that a regular mount transitions to an idmapped mount once going from a static nop_mnt_idmap mapping to a non-static idmapping. - The idmapping of a mount change anymore for the lifetime of the mount afterwards. This not just avoids UAF issues it also avoids pitfalls such as generating non-matching uid/gid values. Changing idmappings could be solved by: - Idmappings could simply be reference counted (above the simple reference count when sharing them across multiple mounts). This would require pairing mnt_idmap_get() with mnt_idmap_put() which would end up being sprinkled everywhere into the VFS and some filesystems that access idmappings directly. It wouldn't just be quite ugly and introduce new complexity it would have a noticeable performance impact. - Idmappings could gain RCU protection. This would help the LOOKUP_RCU case and avoids taking reference counts under RCU. When not under LOOKUP_RCU reference counts need to be acquired on each idmapping. This would require pairing mnt_idmap_get() with mnt_idmap_put() which would end up being sprinkled everywhere into the VFS and some filesystems that access idmappings directly. This would have the same downsides as mentioned earlier. - The earlier solutions work by updating the mnt->mnt_idmap pointer with the new idmapping. Instead of this it would be possible to change the idmapping itself to avoid UAF issues. To do this a sequence counter would have to be added to struct mount. When retrieving the idmapping to generate uid/gid values the sequence counter would need to be sampled and the generation of the uid/gid would spin until the update of the idmap is finished. This has problems as well but the biggest issue will be that this can lead to inconsistent permission checking and inconsistent uid/gid pairs even more than this is already possible today. Specifically, during creation it could happen that: idmap = mnt_idmap(mnt); inode_permission(idmap, ...); may_create(idmap); // create file with uid/gid based on @idmap in between the permission checking and the generation of the uid/gid value the idmapping could change leading to the permission checking and uid/gid value that is actually used to create a file on disk being out of sync. Similarly if two values are generated like: idmap = mnt_idmap(mnt) vfsgid = make_vfsgid(idmap); // idmapping gets update concurrently vfsuid = make_vfsuid(idmap); @vfsgid and @vfsuid could be out of sync if the idmapping was changed in between. The generation of vfsgid/vfsuid could span a lot of codelines so to guard against this a sequence count would have to be passed around. The performance impact of this solutio are less clear but very likely not zero. - Using SRCU similar to fanotify that can sleep. I find that not just ugly but it would have memory consumption implications and is overall pretty ugly. /* solution */ So, to avoid all of these pitfalls creating an idmapped mount from an already idmapped mount will be done atomically, i.e., a new detached mount is created and a new set of mount properties applied to it without it ever having been exposed to userspace at all. This can be done in two ways. A new flag to open_tree() is added OPEN_TREE_CLEAR_IDMAP that clears the old idmapping and returns a mount that isn't idmapped. And then it is possible to set mount attributes on it again including creation of an idmapped mount. This has the consequence that a file descriptor must exist in userspace that doesn't have any idmapping applied and it will thus never work in unpriviledged scenarios. As a container would be able to remove the idmapping of the mount it has been given. That should be avoided. Instead, we add open_tree_attr() which works just like open_tree() but takes an optional struct mount_attr parameter. This is useful beyond idmappings as it fills a gap where a mount never exists in userspace without the necessary mount properties applied. This is particularly useful for mount options such as MOUNT_ATTR_{RDONLY,NOSUID,NODEV,NOEXEC}. To create a new idmapped mount the following works: // Create a first idmapped mount struct mount_attr attr = { .attr_set = MOUNT_ATTR_IDMAP .userns_fd = fd_userns }; fd_tree = open_tree(-EBADF, "/", OPEN_TREE_CLONE, &attr, sizeof(attr)); move_mount(fd_tree, "", -EBADF, "/mnt", MOVE_MOUNT_F_EMPTY_PATH); // Create a second idmapped mount from the first idmapped mount attr.attr_set = MOUNT_ATTR_IDMAP; attr.userns_fd = fd_userns2; fd_tree2 = open_tree(-EBADF, "/mnt", OPEN_TREE_CLONE, &attr, sizeof(attr)); // Create a second non-idmapped mount from the first idmapped mount: memset(&attr, 0, sizeof(attr)); attr.attr_clr = MOUNT_ATTR_IDMAP; fd_tree2 = open_tree(-EBADF, "/mnt", OPEN_TREE_CLONE, &attr, sizeof(attr)); * patches from https://lore.kernel.org/r/20250128-work-mnt_idmap-update-v2-v1-0-c25feb0d2eb3@kernel.org: fs: allow changing idmappings fs: add kflags member to struct mount_kattr fs: add open_tree_attr() fs: add copy_mount_setattr() helper fs: add vfs_open_tree() helper Link: https://lore.kernel.org/r/20250128-work-mnt_idmap-update-v2-v1-0-c25feb0d2eb3@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-12fs: add open_tree_attr()Christian Brauner
Add open_tree_attr() which allow to atomically create a detached mount tree and set mount options on it. If OPEN_TREE_CLONE is used this will allow the creation of a detached mount with a new set of mount options without it ever being exposed to userspace without that set of mount options applied. Link: https://lore.kernel.org/r/20250128-work-mnt_idmap-update-v2-v1-3-c25feb0d2eb3@kernel.org Reviewed-by: "Seth Forshee (DigitalOcean)" <sforshee@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-12statmount: add a new supported_mask fieldJeff Layton
Some of the fields in the statmount() reply can be optional. If the kernel has nothing to emit in that field, then it doesn't set the flag in the reply. This presents a problem: There is currently no way to know what mask flags the kernel supports since you can't always count on them being in the reply. Add a new STATMOUNT_SUPPORTED_MASK flag and field that the kernel can set in the reply. Userland can use this to determine if the fields it requires from the kernel are supported. This also gives us a way to deprecate fields in the future, if that should become necessary. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20250206-statmount-v2-1-6ae70a21c2ab@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-12statmount: allow to retrieve idmappingsChristian Brauner
This adds the STATMOUNT_MNT_UIDMAP and STATMOUNT_MNT_GIDMAP options. It allows the retrieval of idmappings via statmount(). Currently it isn't possible to figure out what idmappings are applied to an idmapped mount. This information is often crucial. Before statmount() the only realistic options for an interface like this would have been to add it to /proc/<pid>/fdinfo/<nr> or to expose it in /proc/<pid>/mountinfo. Both solution would have been pretty ugly and would've shown information that is of strong interest to some application but not all. statmount() is perfect for this. The idmappings applied to an idmapped mount are shown relative to the caller's user namespace. This is the most useful solution that doesn't risk leaking information or confuse the caller. For example, an idmapped mount might have been created with the following idmappings: mount --bind -o X-mount.idmap="0:10000:1000 2000:2000:1 3000:3000:1" /srv /opt Listing the idmappings through statmount() in the same context shows: mnt_id: 2147485088 mnt_parent_id: 2147484816 fs_type: btrfs mnt_root: /srv mnt_point: /opt mnt_opts: ssd,discard=async,space_cache=v2,subvolid=5,subvol=/ mnt_uidmap[0]: 0 10000 1000 mnt_uidmap[1]: 2000 2000 1 mnt_uidmap[2]: 3000 3000 1 mnt_gidmap[0]: 0 10000 1000 mnt_gidmap[1]: 2000 2000 1 mnt_gidmap[2]: 3000 3000 1 But the idmappings might not always be resolvable in the caller's user namespace. For example: unshare --user --map-root In this case statmount() will skip any mappings that fil to resolve in the caller's idmapping: mnt_id: 2147485087 mnt_parent_id: 2147484016 fs_type: btrfs mnt_root: /srv mnt_point: /opt mnt_opts: ssd,discard=async,space_cache=v2,subvolid=5,subvol=/ The caller can differentiate between a mount not being idmapped and a mount that is idmapped but where all mappings fail to resolve in the caller's idmapping by check for the STATMOUNT_MNT_{G,U}IDMAP flag being raised but the number of mappings in ->mnt_{g,u}idmap_num being zero. Note that statmount() requires that the whole range must be resolvable in the caller's user namespace. If a subrange fails to map it will still list the map as not resolvable. This is a practical compromise to avoid having to find which subranges are resovable and wich aren't. Idmappings are listed as a string array with each mapping separated by zero bytes. This allows to retrieve the idmappings and immediately use them for writing to e.g., /proc/<pid>/{g,u}id_map and it also allow for simple iteration like: if (stmnt->mask & STATMOUNT_MNT_UIDMAP) { const char *idmap = stmnt->str + stmnt->mnt_uidmap; for (size_t idx = 0; idx < stmnt->mnt_uidmap_nr; idx++) { printf("mnt_uidmap[%lu]: %s\n", idx, idmap); idmap += strlen(idmap) + 1; } } Link: https://lore.kernel.org/r/20250204-work-mnt_idmap-statmount-v2-2-007720f39f2e@kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-12uidgid: add map_id_range_up()Christian Brauner
Add map_id_range_up() to verify that the full kernel id range can be mapped up in a given idmapping. This will be used in follow-up patches. Link: https://lore.kernel.org/r/20250204-work-mnt_idmap-statmount-v2-1-007720f39f2e@kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-12drm/sched: Use struct for drm_sched_init() paramsPhilipp Stanner
drm_sched_init() has a great many parameters and upcoming new functionality for the scheduler might add even more. Generally, the great number of parameters reduces readability and has already caused one missnaming, addressed in: commit 6f1cacf4eba7 ("drm/nouveau: Improve variable name in nouveau_sched_init()"). Introduce a new struct for the scheduler init parameters and port all users. Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Acked-by: Matthew Brost <matthew.brost@intel.com> # for Xe Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> # for Panfrost and Panthor Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> # for Etnaviv Reviewed-by: Frank Binns <frank.binns@imgtec.com> # for Imagination Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> # for Sched Reviewed-by: MaĆ­ra Canal <mcanal@igalia.com> # for v3d Reviewed-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> # for amdxdna Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250211111422.21235-2-phasta@kernel.org
2025-02-12gpiolib: add gpiod_multi_set_value_cansleep()David Lechner
Add a new gpiod_multi_set_value_cansleep() helper function with fewer parameters than gpiod_set_array_value_cansleep(). Calling gpiod_set_array_value_cansleep() can get quite verbose. In many cases, the first arguments all come from the same struct gpio_descs, so having a separate function where we can just pass that cuts down on the boilerplate. Signed-off-by: David Lechner <dlechner@baylibre.com> Reviewed-by: Andy Shevchenko <andy@kernel.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20250210-gpio-set-array-helper-v3-1-d6a673674da8@baylibre.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-02-12drm: ensure drm headers are self-contained and pass kernel-docJani Nikula
Ensure drm headers build, are self-contained, have header guards, and have no kernel-doc warnings, when CONFIG_DRM_HEADER_TEST=y. The mechanism follows similar patters used in i915, xe, and usr/include. To cover include/drm, we need to recurse there using the top level Kbuild and the new include/Kbuild files. v4: check for CONFIG_WERROR in addition to CONFIG_DRM_WERROR v3: adapt to upstream build changes v2: make DRM_HEADER_TEST depend on DRM Suggested-by: Daniel Vetter <daniel@ffwll.ch> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/d8ad1c6d707f38a55987f616cb9650aef30b84e1.1737556766.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-02-12drm/client: include types.h to make drm_client_event.h self-containedJani Nikula
drm_client_event.h uses bool without types.h, include it. Fixes: bf17766f1083 ("drm/client: Move suspend/resume into DRM client callbacks") Cc: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/6c69cb005cc3a2b968b6c9675d0ba03173f6c574.1737556766.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-02-12f2fs: add ioctl to get IO priority hintJaegeuk Kim
This patch adds an ioctl to give a per-file priority hint to attach REQ_PRIO. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-02-11net: phy: remove unused PHY_INIT_TIMEOUT and PHY_FORCE_TIMEOUTHeiner Kallweit
Both definitions are unused. Last users have been removed with: f3ba9d490d6e ("net: s6gmac: remove driver") 2bd229df5e2e ("net: phy: remove state PHY_FORCING") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/f8e7b8ed-a665-41ad-b0ce-cbfdb65262ef@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11net: phy: rename phy_set_eee_broken to phy_disable_eee_modeHeiner Kallweit
Consider that an EEE mode may not be broken but simply not supported by the MAC, and rename function phy_set_eee_broken(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/30deb630-3f6b-4ffb-a1e6-a9736021f43a@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11net: phy: rename eee_broken_modes to eee_disabled_modesHeiner Kallweit
This bitmap is used also if the MAC doesn't support an EEE mode. So the mode isn't necessarily broken in the PHY. Therefore rename the bitmap. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/6cd11422-dd67-4c87-a642-308de694af92@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11block: introduce init_wait_func()Muchun Song
There is already a macro DEFINE_WAIT_FUNC() to declare a wait_queue_entry with a specified waking function. But there is not a counterpart for initializing one wait_queue_entry with a specified waking function. So introducing init_wait_func() for this, which also could be used in iocost and rq-qos. Using default_wake_function() in rq_qos_wait() to wake up waiters, which could remove ->task field from rq_qos_wait_data. Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/r/20250208090416.38642-1-songmuchun@bytedance.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-11thermal/netlink: Prevent userspace segmentation fault by adjusting UAPI headerZhang Rui
The intel-lpmd tool [1], which uses the THERMAL_GENL_ATTR_CPU_CAPABILITY attribute to receive HFI events from kernel space, encounters a segmentation fault after commit 1773572863c4 ("thermal: netlink: Add the commands and the events for the thresholds"). The issue arises because the THERMAL_GENL_ATTR_CPU_CAPABILITY raw value was changed while intel_lpmd still uses the old value. Although intel_lpmd can be updated to check the THERMAL_GENL_VERSION and use the appropriate THERMAL_GENL_ATTR_CPU_CAPABILITY value, the commit itself is questionable. The commit introduced a new element in the middle of enum thermal_genl_attr, which affects many existing attributes and introduces potential risks and unnecessary maintenance burdens for userspace thermal netlink event users. Solve the issue by moving the newly introduced THERMAL_GENL_ATTR_TZ_PREV_TEMP attribute to the end of the enum thermal_genl_attr. This ensures that all existing thermal generic netlink attributes remain unaffected. Link: https://github.com/intel/intel-lpmd [1] Fixes: 1773572863c4 ("thermal: netlink: Add the commands and the events for the thresholds") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20250208074907.5679-1-rui.zhang@intel.com [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-02-11dt-bindings: iio: adc: adi,ad4695: add SPI offload propertiesDavid Lechner
Add a pwms property to the adi,ad4695 binding to specify an optional PWM output connected to the CNV pin on the ADC. Also add #trigger-source-cells property to allow the BUSY output to be used as a SPI offload trigger source to indicate when a sample is ready to be read. Macros are added to adi,ad4695.h for the cell values to help with readability since they are arbitrary values. Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20250207-dlech-mainline-spi-engine-offload-2-v8-13-e48a489be48c@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-02-11pwm: lpss: Clarify the bypass member semantics in struct pwm_lpss_boardinfoAndy Shevchenko
Instead of an odd comment, cite the documentation, which says more clearly what's going on with the programming flow on some of the Intel SoCs. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2025-02-11spi: offload: types: include linux/bits.hDavid Lechner
Add #include <linux/bits.h> to linux/spi/offload/types.h since this file uses the BIT macro. Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com> Signed-off-by: David Lechner <dlechner@baylibre.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patch.msgid.link/20250210-spi-offload-extra-headers-v1-1-0f3356362254@baylibre.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-02-11mtd: rawnand: qcom: finish converting register to FIELD_PREPChristian Marangi
With some research in some obscure old QSDK, it was possible to find the MASK of the last register there were still set with raw shift and convert them to FIELD_PREP API. This is only a cleanup and modernize the code a bit and doesn't make any behaviour change. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2025-02-11tcp: add tcp_rto_max_ms sysctlEric Dumazet
Previous patch added a TCP_RTO_MAX_MS socket option to tune a TCP socket max RTO value. Many setups prefer to change a per netns sysctl. This patch adds /proc/sys/net/ipv4/tcp_rto_max_ms Its initial value is 120000 (120 seconds). Keep in mind that a decrease of tcp_rto_max_ms means shorter overall timeouts, unless tcp_retries2 sysctl is increased. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11tcp: add the ability to control max RTOEric Dumazet
Currently, TCP stack uses a constant (120 seconds) to limit the RTO value exponential growth. Some applications want to set a lower value. Add TCP_RTO_MAX_MS socket option to set a value (in ms) between 1 and 120 seconds. It is discouraged to change the socket rto max on a live socket, as it might lead to unexpected disconnects. Following patch is adding a netns sysctl to control the default value at socket creation time. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11tcp: add a @pace_delay parameter to tcp_reset_xmit_timer()Eric Dumazet
We want to factorize calls to inet_csk_reset_xmit_timer(), to ease TCP_RTO_MAX change. Current users want to add tcp_pacing_delay(sk) to the timeout. Remaining calls to inet_csk_reset_xmit_timer() do not add the pacing delay. Following patch will convert them, passing false for @pace_delay. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11tcp: remove tcp_reset_xmit_timer() @max_when argumentEric Dumazet
All callers use TCP_RTO_MAX, we can factorize this constant, becoming a variable soon. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11wifi: mac80211: set ieee80211_prep_tx_info::link_id upon Auth RxEmmanuel Grumbach
This will be used by the low level driver. Note that link_id will be 0 in case of a non-MLO authentication. Also fix a call-site of mgd_prepare_tx() where the link_id was not populated. Update the documentation to reflect the current state ieee80211_prep_tx_info::link_id is also available in mgd_complete_tx(). Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250205110958.6a590f189ce5.I1fc5c0da26b143f5b07191eb592f01f7083d55ae@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-02-11wifi: mac80211: add strict mode disabling workaroundsJohannes Berg
Add a strict mode where we disable certain workarounds and have additional checks such as, for now, that VHT capabilities from association response match those from beacon/probe response. We can extend the checks in the future. Make it an opt-in setting by the driver so it can be set there in some driver-specific way, for example. Also allow setting this one hw flag through the hwflags debugfs, by writing a new strict=0 or strict=1 value. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250205110958.5cecb0469479.I4a69617dc60ba0d6308416ffbc3102cfd08ba068@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-02-11wifi: ieee80211: Add missing EHT MAC capabilitiesIlan Peer
Add missing EHT MAC capabilities definitions. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250205110958.6c1643c345a1.I7405b9c35cb39ae97a52c3fbcc36b0bd81e495dc@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-02-11wifi: mac80211: Add support for EPCS configurationIlan Peer
Add support for configuring EPCS state: - When EPCS is enabled, send an EPCS enable request action frame to the AP. When the AP replies with EPCS enable response, enable EPCS by applying the QoS parameters provided by the AP. Do so for all the valid MLD links. Once EPCS is enabled, support processing of unsolicited EPCS enable response frames. - When EPCS is disabled, send an EPCS teardown request to the AP and apply the QoS parameters as obtained from the last received beacons. Do so for all the valid links. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250205110958.7a90afd7e140.I3f602d65f5c1fd849d6c70b12307dda33aa91ccb@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-02-11wifi: mac80211: Drop cooked monitor supportAlexander Wetzel
Hostapd switched from cooked monitor interfaces to nl80211 Dec 2011. Drop support for the outdated cooked monitor interfaces and fix creating the virtual monitor interfaces in the following cases: 1) We have one non-monitor and one monitor interface with %MONITOR_FLAG_ACTIVE enabled and then delete the non-monitor interface. 2) We only have monitor interfaces enabled on resume while at least one has %MONITOR_FLAG_ACTIVE set. Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de> Link: https://patch.msgid.link/20250204111352.7004-2-Alexander@wetzel-home.de Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-02-11wifi: nl80211/cfg80211: Stop supporting cooked monitorAlexander Wetzel
Unconditionally start to refuse creating cooked monitor interfaces to phase them out. There is no feature flag for drivers to opt-in for cooked monitor and all known users are using/preferring the modern API since the hostapd release 1.0 in May 2012. Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de> Link: https://patch.msgid.link/20250204111352.7004-1-Alexander@wetzel-home.de Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-02-10net: fib_rules: Factorise fib_newrule() and fib_delrule().Kuniyuki Iwashima
fib_nl_newrule() / fib_nl_delrule() is the doit() handler for RTM_NEWRULE / RTM_DELRULE but also called from vrf_newlink(). Currently, we hold RTNL on both paths but will not on the former. Also, we set dev_net(dev)->rtnl to skb->sk in vrf_fib_rule() because fib_nl_newrule() / fib_nl_delrule() fetch net as sock_net(skb->sk). Let's Factorise the two functions and pass net and rtnl_held flag. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250207072502.87775-6-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-10vrf: use RCU protection in l3mdev_l3_out()Eric Dumazet
l3mdev_l3_out() can be called without RCU being held: raw_sendmsg() ip_push_pending_frames() ip_send_skb() ip_local_out() __ip_local_out() l3mdev_ip_out() Add rcu_read_lock() / rcu_read_unlock() pair to avoid a potential UAF. Fixes: a8e3e1a9f020 ("net: l3mdev: Add hook to output path") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250207135841.1948589-7-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-10r8152: add vendor/device ID pair for Dell Alienware AW1022zAleksander Jan Bajkowski
The Dell AW1022z is an RTL8156B based 2.5G Ethernet controller. Add the vendor and product ID values to the driver. This makes Ethernet work with the adapter. Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Link: https://patch.msgid.link/20250206224033.980115-1-olek2@wp.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-10xsk: add helper to get &xdp_desc's DMA and meta pointer in one goAlexander Lobakin
Currently, when your driver supports XSk Tx metadata and you want to send an XSk frame, you need to do the following: * call external xsk_buff_raw_get_dma(); * call inline xsk_buff_get_metadata(), which calls external xsk_buff_raw_get_data() and then do some inline checks. This effectively means that the following piece: addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr; is done twice per frame, plus you have 2 external calls per frame, plus this: meta = pool->addrs + addr - pool->tx_metadata_len; if (unlikely(!xsk_buff_valid_tx_metadata(meta))) is always inlined, even if there's no meta or it's invalid. Add xsk_buff_raw_get_ctx() (xp_raw_get_ctx() to be precise) to do that in one go. It returns a small structure with 2 fields: DMA address, filled unconditionally, and metadata pointer, non-NULL only if it's present and valid. The address correction is performed only once and you also have only 1 external call per XSk frame, which does all the calculations and checks outside of your hotpath. You only need to check `if (ctx.meta)` for the metadata presence. To not copy any existing code, derive address correction and getting virtual and DMA address into small helpers. bloat-o-meter reports no object code changes for the existing functionality. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20250206182630.3914318-5-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-10unroll: add generic loop unroll helpersAlexander Lobakin
There are cases when we need to explicitly unroll loops. For example, cache operations, filling DMA descriptors on very high speeds etc. Add compiler-specific attribute macros to give the compiler a hint that we'd like to unroll a loop. Example usage: #define UNROLL_BATCH 8 unrolled_count(UNROLL_BATCH) for (u32 i = 0; i < UNROLL_BATCH; i++) op(priv, i); Note that sometimes the compilers won't unroll loops if they think this would have worse optimization and perf than without unrolling, and that unroll attributes are available only starting GCC 8. For older compiler versions, no hints/attributes will be applied. For better unrolling/parallelization, don't have any variables that interfere between iterations except for the iterator itself. Co-developed-by: Jose E. Marchesi <jose.marchesi@oracle.com> # pragmas Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20250206182630.3914318-2-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-10crash: Remove KEXEC_CORE_NOTE_NAMEAkihiko Odaki
KEXEC_CORE_NOTE_NAME is no longer used. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Acked-by: Baoquan He <bhe@redhat.com> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Link: https://lore.kernel.org/r/20250115-elf-v5-6-0f9e55bbb2fc@daynix.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-10crash: Use note name macrosAkihiko Odaki
Use note name macros to match with the userspace's expectation. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Acked-by: Baoquan He <bhe@redhat.com> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Link: https://lore.kernel.org/r/20250115-elf-v5-4-0f9e55bbb2fc@daynix.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-10elf: Define note name macrosAkihiko Odaki
elf.h had a comment saying: > Notes used in ET_CORE. Architectures export some of the arch register > sets using the corresponding note types via the PTRACE_GETREGSET and > PTRACE_SETREGSET requests. > The note name for these types is "LINUX", except NT_PRFPREG that is > named "CORE". However, NT_PRSTATUS is also named "CORE". It is also unclear what "these types" refers to. To fix these problems, define a name for each note type. The added definitions are macros so the kernel and userspace can directly refer to them to remove their duplicate definitions of note names. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Acked-by: Baoquan He <bhe@redhat.com> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Link: https://lore.kernel.org/r/20250115-elf-v5-1-0f9e55bbb2fc@daynix.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-02-10hrtimers: Make hrtimer_update_function() less expensiveThomas Gleixner
The sanity checks in hrtimer_update_function() are expensive for high frequency usage like in the io/uring code due to locking. Hide the sanity checks behind CONFIG_PROVE_LOCKING, which has a decent chance to be enabled on a regular basis for testing. Fixes: 8f02e3563bb5 ("hrtimers: Introduce hrtimer_update_function()") Reported-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/87ikpllali.ffs@tglx
2025-02-10iio: buffer-dmaengine: add devm_iio_dmaengine_buffer_setup_with_handle()David Lechner
Add a new devm_iio_dmaengine_buffer_setup_with_handle() function to handle cases where the DMA channel is managed by the caller rather than being requested and released by the iio_dmaengine module. Reviewed-by: Nuno Sa <nuno.sa@analog.com> Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20250207-dlech-mainline-spi-engine-offload-2-v8-9-e48a489be48c@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-02-10iio: buffer-dmaengine: split requesting DMA channel from allocating bufferDavid Lechner
Refactor the IIO dmaengine buffer code to split requesting the DMA channel from allocating the buffer. We want to be able to add a new function where the IIO device driver manages the DMA channel, so these two actions need to be separate. To do this, calling dma_request_chan() is moved from iio_dmaengine_buffer_alloc() to iio_dmaengine_buffer_setup_ext(). A new __iio_dmaengine_buffer_setup_ext() helper function is added to simplify error unwinding and will also be used by a new function in a later patch. iio_dmaengine_buffer_free() now only frees the buffer and does not release the DMA channel. A new iio_dmaengine_buffer_teardown() function is added to unwind everything done in iio_dmaengine_buffer_setup_ext(). This keeps things more symmetrical with obvious pairs alloc/free and setup/teardown. Calling dma_get_slave_caps() in iio_dmaengine_buffer_alloc() is moved so that we can avoid any gotos for error unwinding. Reviewed-by: Nuno Sa <nuno.sa@analog.com> Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20250207-dlech-mainline-spi-engine-offload-2-v8-8-e48a489be48c@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>