summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)Author
2025-07-19mm/vmstat: make MEMCG select VM_EVENT_COUNTERSKirill A. Shutemov
The vmstat_text array contains labels for counters displayed in /proc/vmstat. It is important to keep the labels in sync with the counters. There is a BUILD_BUG_ON() check in vmstat_start() that ensures the size of the vmstat_text is not smaller than VM_EVENT_COUNTERS. This helps to catch cases where a new counter is added but the label is not. However, it does not help if a counter is removed but the label remains. It would be nice to make the BUILD_BUG_ON() check more strict to catch such cases. However, when compiling with MEMCG enabled but VM_EVENT_COUNTERS disabled, the vmstat_text array is larger than NR_VMSTAT_ITEMS. This issue arises because some elements of the vmstat_text array are present when either MEMCG or VM_EVENT_COUNTERS is enabled, but NR_VMSTAT_ITEMS only accounts for these elements if VM_EVENT_COUNTERS is enabled. Instead of adjusting the NR_VMSTAT_ITEMS definition to account for MEMCG, make MEMCG select VM_EVENT_COUNTERS. VM_EVENT_COUNTERS is enabled in most configurations anyway. Link: https://lkml.kernel.org/r/20250604095111.533783-1-kirill.shutemov@linux.intel.com Fixes: ebc5d83d0443 ("mm/memcontrol: use vmstat names for printing statistics") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Tested-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-19mm/damon/sysfs: use DAMON core API damon_is_running()SeongJae Park
DAMON core implements a static function to see if a given DAMON context is running. DAMON sysfs interface is implementing the same one on its own. Make the core function non-static and reuse it from the DAMON sysfs interface. Link: https://lkml.kernel.org/r/20250705175000.56259-5-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-19Merge tag 'vfs-6.16-rc7.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix a memory leak in fcntl_dirnotify() - Raise SB_I_NOEXEC on secrement superblock instead of messing with flags on the mount - Add fsdevel and block mailing lists to uio entry. We had a few instances were very questionable stuff was added without either block or the VFS being aware of it - Fix netfs copy-to-cache so that it performs collection with ceph+fscache - Fix netfs race between cache write completion and ALL_QUEUED being set - Verify the inode mode when loading entries from disk in isofs - Avoid state_lock in iomap_set_range_uptodate() - Fix PIDFD_INFO_COREDUMP check in PIDFD_GET_INFO ioctl - Fix the incorrect return value in __cachefiles_write() * tag 'vfs-6.16-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: MAINTAINERS: add block and fsdevel lists to iov_iter netfs: Fix race between cache write completion and ALL_QUEUED being set netfs: Fix copy-to-cache so that it performs collection with ceph+fscache fix a leak in fcntl_dirnotify() iomap: avoid unnecessary ifs_set_range_uptodate() with locks isofs: Verify inode mode when loading from disk cachefiles: Fix the incorrect return value in __cachefiles_write() secretmem: use SB_I_NOEXEC coredump: fix PIDFD_INFO_COREDUMP ioctl check
2025-07-19Merge tag 'iio-for-6.17a' of ↵Greg Kroah-Hartman
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next Jonathan writes: IIO: New device support, features, late breaking fixes and cleanup for 6.17 The normal mixed bag. A few more fixes than usual as I failed to send them out earlier. New device support ================== adi,ad4080 - New driver for this high speed ADC. Includes extensions to iio-backends necessary to support filter config, variable data lands and data alignment control. adi,ad4170-4 - New driver for this 24-bit very feature rich ADC suited for weigh scale and thermocouple applications. adi,ad7405 - New driver for this single channel isolated ADC with backend support (adi-axi-adc) google,cros_ec_activity - Add activity detection to the existing set of cros_ec drivers covering both human body and significant motion detection. mediatek,mt6359 - Add support for MT6363 and MT6373 PMIC Auxiliary ADCs. nicera,d3-323-aa - New driver for this configurable Passive InfraRed sensor. Device ID only ============== mediatek,mt7981-auxadc - Add ID to mt2701 driver as fully compatible with mt7986-auxadc. rohm,bu79100g - Add ID to ad7476 driver as fully compatible with TI ADS7866. Features ======== Core - New in_voltageY_convdelay to allow for devices to control timing offsets between sampling different channels. adi,ad-sigma-delta-library - Support SPI offload (later fix for missing Kconfig dependency) adi,ad4851 - SPI 3-wire support. adi,ad7606 - Power supply control. - convdelay and calibbias support for calibration purposes. - gain calibration support based on external filter resistance provided from device tree. adi,ad7768-1 - Add output regulator for VCM output, typically used for preconditioning circuits. - Add gpio controller for the 4 GPIOs. - Multiple scan type support to enable 16-bit modes. - Support synchronization over SPI. - Filter type and oversampling ratio control. - Low pass filter cut off read only attribute. adi,adxl313 - FIFO support - DC activity, inactivity detection with power-save on inactivity - AC coupled activity detection - Documentation for this complex driver. - debugfs register access. adi,adxl345 - Sampling frequency and sensor range controls. bosch,bmi270 - Add step counter support. invensense,icm42600 - Wake on motion support. Cleanup and fixes ================= backend - Drop unused parameter from iio_backend_ovesampling_ratio_set() docs - Fix ABI docs around I and Q modifiers. treewide - Switch remaining drives to use maple tree regcache. - Drop use of DRIVER_NAME style definitions when only used in one place. - Drop unused export.h includes. - Use = { } in place of memset in various drivers. - Constify various info structures and related. - Switch some drivers from array of chip_info structures to individual named structures. adi,ad-sigma_delta library - Fix over allocation of scan buffer. (bits/bytes confusion) - Sort includes and apply iwyu principles to ensure sensible set. - Use u8 instead of uint8_t - Replace hard coded type sizes with sizeof() and BITS_TO_BYTES() as appropriate. - Factor out setting of read address to reduce duplication. - Switch to buffer predisable so error handling on buffer enable functions correctly (balanced against postenable). adi,ad4000 - Don't use sift_right() on an unsigned value. adi,ad7173 - Add missing check on spi_setup() succeeding. - Simplify clock enable disable code using devm_clk_get_enabled() - Fix channel index for syscalib_mode - Fix number of configuration slots for some devics. - Fix the channel used for calibration. - Fix setting ODR up in probe. adi,ad7380 - Drop unused oversampling_ratio getter function call as value never used. adi,ad7606 - Exit if invalid dt_schema encountered rather than carrying on with unknown config. adi,ad7768-1 - Ensure SYNC_IN pulse is long enough. - Switch sampling_frequency_available to read_avail() callback. adi,ada4250 - Ensuring a dma-safe buffer for regmap_bulk_read() - Use a local dev variable to simplify code - Relax chip ID matching to allow for fallback dt compatibles. - Make use of devm_regulator_get_enabled_read_voltage() to replace equivalent code. - Shuffle elements around in struct to improve logical groupings and reduce holes. - Use dev_err_probe() adi,adxl313 - Use regcache to reduce traffic. - Factor out enabling of measurement. adi,adxl345 - Drop irq from struct as only used locally in code - Simplify measure enable function using regmap_update_bits() - Replace some magic numbers by units.h defines - Simplify interrupt mapping code - Simplify FIFO read out. adi,axi-dac - Factor out code to check for bus free to reduce duplication. avago,apds9306 - Use a helper to get register address in both get and set functions. bosch,bmi160+bmi270 - Ensure triggers suspended and resumed correctly. bosch,bmo055 - Fix theoretical OOB acces to hw_xlate array. freescale,vf610 - Drop -ENOMEM error message as plenty of existing prints if memory allocation fails. - Use dev_err_probe() and devm_clk_geT_enabled() to simplify probe(). kionix,kx022a - Apply include what you use principles to includes. invensense,itg3200 - Add missing dt-binding for this gyroscope. invensense,icm42600 - Switch from int64_t and similar to s64 and other kernel types. - Simplify arrangement of DMA safe buffers and potentially reduce structure size a little. invensense,mpu6050 - Reduce duplication in aux read/write code. - Use sysfs_emit() to replace scnprintf() murata,irsd200 - Drop duplicate printing of ret in dev_err_probe() nxp,lpc3220-adc - Add missing clocks property to dt-binding. st,spear600 - Convert dt-binding that got left behind in staging to yaml in the main tree. st,stm32-adc - Use dev_fwnode() rather than directly accessing the of_node. vti,sca3000 - Use direct returns instead of gotos where simple. Various other minor typo and white space fixes. * tag 'iio-for-6.17a' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (201 commits) iio: adc: ad_sigma_delta: Select IIO_BUFFER_DMAENGINE and SPI_OFFLOAD iio: adc: ad7173: fix setting ODR in probe iio: adc: ad7173: fix calibration channel iio: adc: ad7173: fix num_slots iio: adc: ad7173: fix channels index for syscalib_mode iio: adc: ad_sigma_delta: change to buffer predisable iio: ABI: fix correctness of I and Q modifiers iio: Add driver for Nicera D3-323-AA PIR sensor dt-bindings: iio: proximity: Add Nicera D3-323-AA PIR sensor dt-bindings: vendor-prefixes: Add Nicera iio: dac: vf610: Simplify with devm_clk_get_enabled() iio: adc: vf610: Simplify with dev_err_probe iio: adc: vf610: Drop -ENOMEM error message iio: imu: bno055: make bno055_sysfs_attr const iio: imu: bno055: fix OOB access of hw_xlate array dt-bindings: iio: adc: Add support for MT7981 iio: accel: kionix-kx022a: Apply approximate iwyu principles to includes iio: adc: ad4170-4: Add support for weigh scale, thermocouple, and RTD sens iio: adc: ad4170-4: Add support for internal temperature sensor iio: adc: ad4170-4: Add GPIO controller support ...
2025-07-19srcu: Add guards for SRCU-fast readersPaul E. McKenney
This adds the usual scoped_guard(srcu_fast, &my_srcu) and guard(srcu_fast)(&my_srcu). Suggested-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.upadhyay@kernel.org>
2025-07-18net: s/dev_close_many/netif_close_many/Stanislav Fomichev
Commit cc34acd577f1 ("docs: net: document new locking reality") introduced netif_ vs dev_ function semantics: the former expects locked netdev, the latter takes care of the locking. We don't strictly follow this semantics on either side, but there are more dev_xxx handlers now that don't fit. Rename them to netif_xxx where appropriate. netif_close_many is used only by vlan/dsa and one mtk driver, so move it into NETDEV_INTERNAL namespace. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250717172333.1288349-8-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-18net: s/dev_set_threaded/netif_set_threaded/Stanislav Fomichev
Commit cc34acd577f1 ("docs: net: document new locking reality") introduced netif_ vs dev_ function semantics: the former expects locked netdev, the latter takes care of the locking. We don't strictly follow this semantics on either side, but there are more dev_xxx handlers now that don't fit. Rename them to netif_xxx where appropriate. Note that one dev_set_threaded call still remains in mt76 for debugfs file. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250717172333.1288349-7-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-18net: s/dev_get_flags/netif_get_flags/Stanislav Fomichev
Commit cc34acd577f1 ("docs: net: document new locking reality") introduced netif_ vs dev_ function semantics: the former expects locked netdev, the latter takes care of the locking. We don't strictly follow this semantics on either side, but there are more dev_xxx handlers now that don't fit. Rename them to netif_xxx where appropriate. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250717172333.1288349-6-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-18net: s/__dev_set_mtu/__netif_set_mtu/Stanislav Fomichev
Commit cc34acd577f1 ("docs: net: document new locking reality") introduced netif_ vs dev_ function semantics: the former expects locked netdev, the latter takes care of the locking. We don't strictly follow this semantics on either side, but there are more dev_xxx handlers now that don't fit. Rename them to netif_xxx where appropriate. __netif_set_mtu is used only by bond, so move it into NETDEV_INTERNAL namespace. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250717172333.1288349-5-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-18net: s/dev_pre_changeaddr_notify/netif_pre_changeaddr_notify/Stanislav Fomichev
Commit cc34acd577f1 ("docs: net: document new locking reality") introduced netif_ vs dev_ function semantics: the former expects locked netdev, the latter takes care of the locking. We don't strictly follow this semantics on either side, but there are more dev_xxx handlers now that don't fit. Rename them to netif_xxx where appropriate. netif_pre_changeaddr_notify is used only by ipvlan/bond, so move it into NETDEV_INTERNAL namespace. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250717172333.1288349-4-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-18net: s/dev_get_mac_address/netif_get_mac_address/Stanislav Fomichev
Commit cc34acd577f1 ("docs: net: document new locking reality") introduced netif_ vs dev_ function semantics: the former expects locked netdev, the latter takes care of the locking. We don't strictly follow this semantics on either side, but there are more dev_xxx handlers now that don't fit. Rename them to netif_xxx where appropriate. netif_get_mac_address is used only by tun/tap, so move it into NETDEV_INTERNAL namespace. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250717172333.1288349-3-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-18net: s/dev_get_port_parent_id/netif_get_port_parent_id/Stanislav Fomichev
Commit cc34acd577f1 ("docs: net: document new locking reality") introduced netif_ vs dev_ function semantics: the former expects locked netdev, the latter takes care of the locking. We don't strictly follow this semantics on either side, but there are more dev_xxx handlers now that don't fit. Rename them to netif_xxx where appropriate. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250717172333.1288349-2-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-18net: track pfmemalloc drops via SKB_DROP_REASON_PFMEMALLOCJesper Dangaard Brouer
Add a new SKB drop reason (SKB_DROP_REASON_PFMEMALLOC) to track packets dropped due to memory pressure. In production environments, we've observed memory exhaustion reported by memory layer stack traces, but these drops were not properly tracked in the SKB drop reason infrastructure. While most network code paths now properly report pfmemalloc drops, some protocol-specific socket implementations still use sk_filter() without drop reason tracking: - Bluetooth L2CAP sockets - CAIF sockets - IUCV sockets - Netlink sockets - SCTP sockets - Unix domain sockets These remaining cases represent less common paths and could be converted in a follow-up patch if needed. The current implementation provides significantly improved observability into memory pressure events in the network stack, especially for key protocols like TCP and UDP, helping to diagnose problems in production environments. Reported-by: Matt Fleming <mfleming@cloudflare.com> Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://patch.msgid.link/175268316579.2407873.11634752355644843509.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-18Merge branch 'for-6.17/cxl-events-updates' into cxl-for-nextDave Jiang
Update Common Event Record to CXL r3.2 definition. Add additional validity check for event records. Add memory sparing event record tracing.
2025-07-18iommufd: Rename some shortterm-related identifiersXu Yilun
Rename the shortterm-related identifiers to wait-related. The usage of shortterm_users refcount is now beyond its name. It is also used for references which live longer than an ioctl execution. E.g. vdev holds idev's shortterm_users refcount on vdev allocation, releases it during idev's pre_destroy(). Rename the refcount as wait_cnt, since it is always used to sync the referencing & the destruction of the object by waiting for it to go to zero. List all changed identifiers: iommufd_object::shortterm_users -> iommufd_object::wait_cnt REMOVE_WAIT_SHORTTERM -> REMOVE_WAIT iommufd_object_dec_wait_shortterm() -> iommufd_object_dec_wait() zerod_shortterm -> zerod_wait_cnt No functional change intended. Link: https://patch.msgid.link/r/20250716070349.1807226-9-yilun.xu@linux.intel.com Suggested-by: Kevin Tian <kevin.tian@intel.com> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-18iommufd/vdevice: Remove struct device reference from struct vdeviceXu Yilun
Remove struct device *dev from struct vdevice. The dev pointer is the Plan B for vdevice to reference the physical device. As now vdev->idev is added without refcounting concern, just use vdev->idev->dev when needed. To avoid exposing struct iommufd_device in the public header, export a iommufd_vdevice_to_device() helper. Link: https://patch.msgid.link/r/20250716070349.1807226-6-yilun.xu@linux.intel.com Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Co-developed-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-18iommufd: Destroy vdevice on idevice destroyXu Yilun
Destroy iommufd_vdevice (vdev) on iommufd_idevice (idev) destruction so that vdev can't outlive idev. idev represents the physical device bound to iommufd, while the vdev represents the virtual instance of the physical device in the VM. The lifecycle of the vdev should not be longer than idev. This doesn't cause real problem on existing use cases cause vdev doesn't impact the physical device, only provides virtualization information. But to extend vdev for Confidential Computing (CC), there are needs to do secure configuration for the vdev, e.g. TSM Bind/Unbind. These configurations should be rolled back on idev destroy, or the external driver (VFIO) functionality may be impact. The idev is created by external driver so its destruction can't fail. The idev implements pre_destroy() op to actively remove its associated vdev before destroying itself. There are 3 cases on idev pre_destroy(): 1. vdev is already destroyed by userspace. No extra handling needed. 2. vdev is still alive. Use iommufd_object_tombstone_user() to destroy vdev and tombstone the vdev ID. 3. vdev is being destroyed by userspace. The vdev ID is already freed, but vdev destroy handler is not completed. This requires multi-threads syncing - vdev holds idev's short term users reference until vdev destruction completes, idev leverages existing wait_shortterm mechanism for syncing. idev should also block any new reference to it after pre_destroy(), or the following wait shortterm would timeout. Introduce a 'destroying' flag, set it to true on idev pre_destroy(). Any attempt to reference idev should honor this flag under the protection of idev->igroup->lock. Link: https://patch.msgid.link/r/20250716070349.1807226-5-yilun.xu@linux.intel.com Originally-by: Nicolin Chen <nicolinc@nvidia.com> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Co-developed-by: "Aneesh Kumar K.V (Arm)" <aneesh.kumar@kernel.org> Signed-off-by: "Aneesh Kumar K.V (Arm)" <aneesh.kumar@kernel.org> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-07-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc6Alexei Starovoitov
Cross-merge BPF and other fixes after downstream PR. No conflicts. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-07-18io_uring/cmd: remove struct io_uring_cmd_dataCaleb Sander Mateos
There are no more users of struct io_uring_cmd_data and its op_data field. Remove it to shave 8 bytes from struct io_async_cmd and eliminate a store and load for every uring_cmd. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Acked-by: David Sterba <dsterba@suse.com> Link: https://lore.kernel.org/r/20250708202212.2851548-5-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-07-18io_uring/cmd: introduce IORING_URING_CMD_REISSUE flagCaleb Sander Mateos
Add a flag IORING_URING_CMD_REISSUE that ->uring_cmd() implementations can use to tell whether this is the first or subsequent issue of the uring_cmd. This will allow ->uring_cmd() implementations to store information in the io_uring_cmd's pdu across issues. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Acked-by: David Sterba <dsterba@suse.com> Link: https://lore.kernel.org/r/20250708202212.2851548-3-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-07-18Merge tag 'phy-fix-6.16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy fixes from Vinod Koul: "Core: - use per-PHY lockdep keys, in order to fix a phy using internal phys Drivers: - tegra: - fixes for unbalanced regulator - decouple pad calibration fix - disable periodic updates - qualcomm: - error code fix for driver probe" * tag 'phy-fix-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: qcom: fix error code in snps_eusb2_hsphy_probe() phy: use per-PHY lockdep keys phy: tegra: xusb: Fix unbalanced regulator disable in UTMI PHY mode phy: tegra: xusb: Disable periodic tracking on Tegra234 phy: tegra: xusb: Decouple CYA_TRK_CODE_UPDATE_ON_IDLE from trk_hw_mode
2025-07-18cxl/events: Trace Memory Sparing Event RecordShiju Jose
CXL rev 3.2 section 8.2.10.2.1.4 Table 8-60 defines the Memory Sparing Event Record. Determine if the event read is memory sparing record and if so trace the record. Memory device shall produce a memory sparing event record 1. After completion of a PPR maintenance operation if the memory sparing event record enable bit is set (Field: sPPR/hPPR Operation Mode in Table 8-128/Table 8-131). 2. In response to a query request by the host (see section 8.2.10.7.1.4) to determine the availability of sparing resources. The device shall report the resource availability by producing the Memory Sparing Event Record (see Table 8-60) in which the channel, rank, nibble mask, bank group, bank, row, column, sub-channel fields are a copy of the values specified in the request. If the controller does not support reporting whether a resource is available, and a perform maintenance operation for memory sparing is issued with query resources set to 1, the controller shall return invalid input. Example trace log for produce memory sparing event record on completion of a soft PPR operation, cxl_memory_sparing: memdev=mem1 host=0000:0f:00.0 serial=3 log=Informational : time=55045163029 uuid=e71f3a40-2d29-4092-8a39-4d1c966c7c65 len=128 flags='0x1' handle=1 related_handle=0 maint_op_class=2 maint_op_sub_class=1 ld_id=0 head_id=0 : flags='' result=0 validity_flags='CHANNEL|RANK|NIBBLE|BANK GROUP|BANK|ROW|COLUMN' spare resource avail=1 channel=2 rank=5 nibble_mask=a59c bank_group=2 bank=4 row=13 column=23 sub_channel=0 comp_id=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 comp_id_pldm_valid_flags='' pldm_entity_id=0x00 pldm_resource_id=0x00 Note: For memory sparing event record, fields 'maintenance operation class' and 'maintenance operation subclass' are defined twice, first in the common event record (Table 8-55) and second in the memory sparing event record (Table 8-60). Thus those in the sparing event record coded as reserved, to be removed when the spec is updated. Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250717101817.2104-5-shiju.jose@huawei.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-07-18cxl/events: Update Common Event Record to CXL spec rev 3.2Shiju Jose
CXL spec 3.2 section 8.2.10.2.1 Table 8-55, Common Event Record format defined new fields LD-ID and Head ID. LD-ID: ID of logical device from where the event originated, which is valid only if LD-ID valid flag is set to 1. CXL spec 3.2 Section 2.4 describes, a Type 3 Multi-Logical Device (MLD) can partition its resources into up to 16 isolated Logical Devices. Each Logical Device is identified by a Logical Device Identifier (LD-ID) in CXL.mem and CXL.io protocols. LD-ID is a 16-bit Logical Device identifier applicable for CXL.io and CXL.mem requests and responses. CXL.mem supports only the lower 4 bits of LD-ID and therefore can support up to 16 unique LD-ID values over the link. Requests and responses forwarded over an MLD Port are tagged with LD-ID. Head ID: ID of the device head, from where the event originated, which is valid only if head valid flag is set to 1. Add updates for the above spec changes in the CXL events record and CXL common trace event implementation. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Link: https://patch.msgid.link/20250717101817.2104-2-shiju.jose@huawei.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-07-18wifi: mac80211: support initialising an S1G short beaconing BSSLachlan Hodges
Introduce the ability to parse the short beacon data and long beacon period. The long beacon period represents the number of beacon intervals between each long beacon transmission. Additionally, as a BSS cannot change its configuration such that short beaconing is dynamically disabled/enabled without tearing down the interface - we ensure we have an existing short beacon before performing the update. Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20250717074205.312577-3-lachlan.hodges@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-18wifi: cfg80211: support configuring an S1G short beaconing BSSLachlan Hodges
S1G short beacons are an optional frame type used in an S1G BSS that contain a limited set of elements. While they are optional, they are a fundamental part of S1G that enables significant power saving. Expose 2 additional netlink attributes, NL80211_ATTR_S1G_LONG_BEACON_PERIOD which denotes the number of beacon intervals between each long beacon and NL80211_ATTR_S1G_SHORT_BEACON which is a nested attribute containing the short beacon tail and head. We split them as the long beacon period cannot be updated, and is only used when initialisng the interface, whereas the short beacon data can be used to both initialise and update the templates. This follows how things such as the beacon interval and DTIM period currently operate. During the initialisation path, we ensure we have the long beacon period if the short beacon data is being passed down, whereas the update path will simply update the template if its sent down. The short beacon data is validated using the same routines for regular beacons as they support correctly parsing the short beacon format while ensuring the frame is well-formed. Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20250717074205.312577-2-lachlan.hodges@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-18wifi: brcmfmac: Add support for the SDIO 43751 deviceFabio Estevam
Add the SDIO ID and firmware matching for the 43751 device. Based on the previous work from Marc Gonzalez <mgonzalez@freebox.fr>. Tested on an i.MX6DL board connected to an AP6398SV chip with the brcmfmac43752-sdio.bin firmware taken from: https://source.puri.sm/Librem5/firmware-brcm43752-nonfree Signed-off-by: Fabio Estevam <festevam@gmail.com> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>> Link: https://patch.msgid.link/20250712215307.1310802-1-festevam@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-07-18vdso/gettimeofday: Add support for auxiliary clocksThomas Weißschuh
Expose the auxiliary clocks through the vDSO. Architectures not using the generic vDSO time framework, namely SPARC64, are not supported. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-12-df7d9f87b9b8@linutronix.de
2025-07-18vdso/vsyscall: Update auxiliary clock data in the datapageThomas Weißschuh
Expose the auxiliary clock data so it can be read from the vDSO. Architectures not using the generic vDSO time framework, namely SPARC64, are not supported. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-11-df7d9f87b9b8@linutronix.de
2025-07-18vdso: Introduce aux_clock_resolution_ns()Thomas Weißschuh
Move the constant resolution to a shared header, so the vDSO can use it and return it without going through a syscall. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250701-vdso-auxclock-v1-10-df7d9f87b9b8@linutronix.de
2025-07-18crypto: engine - remove {prepare,unprepare}_crypt_hardware callbacksOvidiu Panait
The {prepare,unprepare}_crypt_hardware callbacks were added back in 2016 by commit 735d37b5424b ("crypto: engine - Introduce the block request crypto engine framework"), but they were never implemented by any driver. Remove them as they are unused. Since the 'engine->idling' and 'was_busy' flags are no longer needed, remove them as well. Signed-off-by: Ovidiu Panait <ovidiu.panait.oss@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-07-18crypto: engine - remove request batching supportOvidiu Panait
Remove request batching support from crypto_engine, as there are no drivers using this feature and it doesn't really work that well. Instead of doing batching based on backlog, a more optimal approach would be for the user to handle the batching (similar to how IPsec can hook into GSO to get 64K of data each time or how block encryption can use unit sizes much greater than 4K). Suggested-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ovidiu Panait <ovidiu.panait.oss@gmail.com> Reviewed-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-07-18crypto: acomp - Fix CFI failure due to type punningEric Biggers
To avoid a crash when control flow integrity is enabled, make the workspace ("stream") free function use a consistent type, and call it through a function pointer that has that same type. Fixes: 42d9f6c77479 ("crypto: acomp - Move scomp stream allocation code into acomp") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-07-18Merge tag 'local-lock-for-net' of ↵Herbert Xu
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into head Local lock changes required by net/crypto
2025-07-18fs: constify file ptr in backing_file accessor helpersAmir Goldstein
Add internal helper backing_file_set_user_path() for the only two cases that need to modify backing_file fields. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/20250607115304.2521155-2-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-17cleanup: Fix documentation build error for ACQUIRE updatesDan Williams
Stephen reports: Documentation/core-api/cleanup:7: include/linux/cleanup.h:73: ERROR: Unexpected indentation. [docutils] Documentation/core-api/cleanup:7: include/linux/cleanup.h:74: WARNING: Block quote ends without a blank line; unexpected unindent. [docutils] Which points out that the ACQUIRE() example in cleanup.h missed the "::" suffix to mark the following text as a code-block. Fixes: 857d18f23ab1 ("cleanup: Introduce ACQUIRE() and ACQUIRE_ERR() for conditional locks") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: http://lore.kernel.org/20250717173354.34375751@canb.auug.org.au Signed-off-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20250717163036.1275791-1-dan.j.williams@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-07-17string: Group str_has_prefix() and strstarts()Andy Shevchenko
The two str_has_prefix() and strstarts() are about the same with a slight difference on what they return. Group them in the header. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20250711085514.1294428-1-andriy.shevchenko@linux.intel.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-07-17neighbour: Update pneigh_entry in pneigh_create().Kuniyuki Iwashima
neigh_add() updates pneigh_entry() found or created by pneigh_create(). This update is serialised by RTNL, but we will remove it. Let's move the update part to pneigh_create() and make it return errno instead of a pointer of pneigh_entry. Now, the pneigh code is RTNL free. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250716221221.442239-16-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17neighbour: Protect tbl->phash_buckets[] with a dedicated mutex.Kuniyuki Iwashima
tbl->phash_buckets[] is only modified in the slow path by pneigh_create() and pneigh_delete() under the table lock. Both of them are called under RTNL, so no extra lock is needed, but we will remove RTNL from the paths. pneigh_create() looks up a pneigh_entry, and this part can be lockless, but it would complicate the logic like 1. lookup 2. allocate pengih_entry for GFP_KERNEL 3. lookup again but under lock 4. if found, return it after freeing the allocated memory 5. else, return the new one Instead, let's add a per-table mutex and run lookup and allocation under it. Note that updating pneigh_entry part in neigh_add() is still protected by RTNL and will be moved to pneigh_create() in the next patch. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250716221221.442239-15-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17neighbour: Remove __pneigh_lookup().Kuniyuki Iwashima
__pneigh_lookup() is the lockless version of pneigh_lookup(), but its only caller pndisc_is_router() holds the table lock and reads pneigh_netry.flags. This is because accessing pneigh_entry after pneigh_lookup() was illegal unless the caller holds RTNL or the table lock. Now, pneigh_entry is guaranteed to be alive during the RCU critical section. Let's call pneigh_lookup() and use READ_ONCE() for n->flags in pndisc_is_router() and remove __pneigh_lookup(). Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250716221221.442239-13-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17neighbour: Free pneigh_entry after RCU grace period.Kuniyuki Iwashima
We will convert RTM_GETNEIGH to RCU. neigh_get() looks up pneigh_entry by pneigh_lookup() and passes it to pneigh_fill_info(). Then, we must ensure that the entry is alive till pneigh_fill_info() completes, but read_lock_bh(&tbl->lock) in pneigh_lookup() does not guarantee that. Also, we will convert all readers of tbl->phash_buckets[] to RCU. Let's use call_rcu() to free pneigh_entry and update phash_buckets[] and ->next by rcu_assign_pointer(). pneigh_ifdown_and_unlock() uses list_head to avoid overwriting ->next and moving RCU iterators to another list. pndisc_destructor() (only IPv6 ndisc uses this) uses a mutex, so it is not delayed to call_rcu(), where we cannot sleep. This is fine because the mcast code works with RCU and ipv6_dev_mc_dec() frees mcast objects after RCU grace period. While at it, we change the return type of pneigh_ifdown_and_unlock() to void. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250716221221.442239-8-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17neighbour: Annotate neigh_table.phash_buckets and pneigh_entry.next with __rcu.Kuniyuki Iwashima
The next patch will free pneigh_entry with call_rcu(). Then, we need to annotate neigh_table.phash_buckets[] and pneigh_entry.next with __rcu. To make the next patch cleaner, let's annotate the fields in advance. Currently, all accesses to the fields are under the neigh table lock, so rcu_dereference_protected() is used with 1 for now, but most of them (except in pneigh_delete() and pneigh_ifdown_and_unlock()) will be replaced with rcu_dereference() and rcu_dereference_check(). Note that pneigh_ifdown_and_unlock() changes pneigh_entry.next to a local list, which is illegal because the RCU iterator could be moved to another list. This part will be fixed in the next patch. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250716221221.442239-7-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17neighbour: Split pneigh_lookup().Kuniyuki Iwashima
pneigh_lookup() has ASSERT_RTNL() in the middle of the function, which is confusing. When called with the last argument, creat, 0, pneigh_lookup() literally looks up a proxy neighbour entry. This is the case of the reader path as the fast path and RTM_GETNEIGH. pneigh_lookup(), however, creates a pneigh_entry when called with creat 1 from RTM_NEWNEIGH and SIOCSARP, which require RTNL. Let's split pneigh_lookup() into two functions. We will convert all the reader paths to RCU, and read_lock_bh(&tbl->lock) in the new pneigh_lookup() will be dropped. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250716221221.442239-6-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17ethtool: rss: initial RSS_SET (indirection table handling)Jakub Kicinski
Add initial support for RSS_SET, for now only operations on the indirection table are supported. Unlike the ioctl don't check if at least one parameter is being changed. This is how other ethtool-nl ops behave, so pick the ethtool-nl consistency vs copying ioctl behavior. There are two special cases here: 1) resetting the table to defaults; 2) support for tables of different size. For (1) I use an empty Netlink attribute (array of size 0). (2) may require some background. AFAICT a lot of modern devices allow allocating RSS tables of different sizes. mlx5 can upsize its tables, bnxt has some "table size calculation", and Intel folks asked about RSS table sizing in context of resource allocation in the past. The ethtool IOCTL API has a concept of table size, but right now the user is expected to provide a table exactly the size the device requests. Some drivers may change the table size at runtime (in response to queue count changes) but the user is not in control of this. What's not great is that all RSS contexts share the same table size. For example a device with 128 queues enabled, 16 RSS contexts 8 queues in each will likely have 256 entry tables for each of the 16 contexts, while 32 would be more than enough given each context only has 8 queues. To address this the Netlink API should avoid enforcing table size at the uAPI level, and should allow the user to express the min table size they expect. To fully solve (2) we will need more driver plumbing but at the uAPI level this patch allows the user to specify a table size smaller than what the device advertises. The device table size must be a multiple of the user requested table size. We then replicate the user-provided table to fill the full device size table. This addresses the "allow the user to express the min table size" objective, while not enforcing any fixed size. From Netlink perspective .get_rxfh_indir_size() is now de facto the "max" table size supported by the device. We may choose to support table replication in ethtool, too, when we actually plumb this thru the device APIs. Initially I was considering moving full pattern generation to the kernel (which queues to use, at which frequency and what min sequence length). I don't think this complexity would buy us much and most if not all devices have pow-2 table sizes, which simplifies the replication a lot. Reviewed-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250716000331.1378807-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17PCI: Add pci_is_display() to check if device is a display controllerMario Limonciello
Several places in the kernel do class shifting to match whether a PCI device is display class. Add pci_is_display() for those places to use. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Daniel Dadap <ddadap@nvidia.com> Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://patch.msgid.link/20250717173812.3633478-2-superm1@kernel.org
2025-07-17stop_machine: Improve kernel-doc function-header commentsPaul E. McKenney
Add more detail to the kernel-doc function-header comments for stop_machine(), stop_machine_cpuslocked(), and stop_core_cpuslocked(). Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2025-07-17Merge back earlier cpufreq material for 6.17-rc1Rafael J. Wysocki
2025-07-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.16-rc7). Conflicts: Documentation/netlink/specs/ovpn.yaml 880d43ca9aa4 ("netlink: specs: clean up spaces in brackets") af52020fc599 ("ovpn: reject unexpected netlink attributes") drivers/net/phy/phy_device.c a44312d58e78 ("net: phy: Don't register LEDs for genphy") f0f2b992d818 ("net: phy: Don't register LEDs for genphy") https://lore.kernel.org/20250710114926.7ec3a64f@kernel.org drivers/net/wireless/intel/iwlwifi/fw/regulatory.c drivers/net/wireless/intel/iwlwifi/mld/regulatory.c 5fde0fcbd760 ("wifi: iwlwifi: mask reserved bits in chan_state_active_bitmap") ea045a0de3b9 ("wifi: iwlwifi: add support for accepting raw DSM tables by firmware") net/ipv6/mcast.c ae3264a25a46 ("ipv6: mcast: Delay put pmc->idev in mld_del_delrec()") a8594c956cc9 ("ipv6: mcast: Avoid a duplicate pointer check in mld_del_delrec()") https://lore.kernel.org/8cc52891-3653-4b03-a45e-05464fe495cf@kernel.org No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-17Merge back earlier material related to system sleepRafael J. Wysocki
2025-07-17cgroup: llist: avoid memory tears for llist_nodeShakeel Butt
Before the commit 36df6e3dbd7e ("cgroup: make css_rstat_updated nmi safe"), the struct llist_node is expected to be private to the one inserting the node to the lockless list or the one removing the node from the lockless list. After the mentioned commit, the llist_node in the rstat code is per-cpu shared between the stacked contexts i.e. process, softirq, hardirq & nmi. It is possible the compiler may tear the loads or stores of llist_node. Let's avoid that. KCSAN reported the following race: Reported by Kernel Concurrency Sanitizer on: CPU: 60 UID: 0 PID: 5425 ... 6.16.0-rc3-next-20250626 #1 NONE Tainted: [E]=UNSIGNED_MODULE Hardware name: ... ================================================================== ================================================================== BUG: KCSAN: data-race in css_rstat_flush / css_rstat_updated write to 0xffffe8fffe1c85f0 of 8 bytes by task 1061 on cpu 1: css_rstat_flush+0x1b8/0xeb0 __mem_cgroup_flush_stats+0x184/0x190 flush_memcg_stats_dwork+0x22/0x50 process_one_work+0x335/0x630 worker_thread+0x5f1/0x8a0 kthread+0x197/0x340 ret_from_fork+0xd3/0x110 ret_from_fork_asm+0x11/0x20 read to 0xffffe8fffe1c85f0 of 8 bytes by task 3551 on cpu 15: css_rstat_updated+0x81/0x180 mod_memcg_lruvec_state+0x113/0x2d0 __mod_lruvec_state+0x3d/0x50 lru_add+0x21e/0x3f0 folio_batch_move_lru+0x80/0x1b0 __folio_batch_add_and_move+0xd7/0x160 folio_add_lru_vma+0x42/0x50 do_anonymous_page+0x892/0xe90 __handle_mm_fault+0xfaa/0x1520 handle_mm_fault+0xdc/0x350 do_user_addr_fault+0x1dc/0x650 exc_page_fault+0x5c/0x110 asm_exc_page_fault+0x22/0x30 value changed: 0xffffe8fffe18e0d0 -> 0xffffe8fffe1c85f0 $ ./scripts/faddr2line vmlinux css_rstat_flush+0x1b8/0xeb0 css_rstat_flush+0x1b8/0xeb0: init_llist_node at include/linux/llist.h:86 (inlined by) llist_del_first_init at include/linux/llist.h:308 (inlined by) css_process_update_tree at kernel/cgroup/rstat.c:148 (inlined by) css_rstat_updated_list at kernel/cgroup/rstat.c:258 (inlined by) css_rstat_flush at kernel/cgroup/rstat.c:389 $ ./scripts/faddr2line vmlinux css_rstat_updated+0x81/0x180 css_rstat_updated+0x81/0x180: css_rstat_updated at kernel/cgroup/rstat.c:90 (discriminator 1) These are expected race and a simple READ_ONCE/WRITE_ONCE resolves these reports. However let's add comments to explain the race and the need for memory barriers if stronger guarantees are needed. More specifically the rstat updater and the flusher can race and cause a scenario where the stats updater skips adding the css to the lockless list but the flusher might not see those updates done by the skipped updater. This is benign race and the subsequent flusher will flush those stats and at the moment there aren't any rstat users which are not fine with this kind of race. However some future user might want more stricter guarantee, so let's add appropriate comments to ease the job of future users. Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Fixes: 36df6e3dbd7e ("cgroup: make css_rstat_updated nmi safe") Signed-off-by: Tejun Heo <tj@kernel.org>
2025-07-17Merge tag 'net-6.16-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from Bluetooth, CAN, WiFi and Netfilter. More code here than I would have liked. That said, better now than next week. Nothing particularly scary stands out. The improvement to the OpenVPN input validation is a bit large but better get them in before the code makes it to a final release. Some of the changes we got from sub-trees could have been split better between the fix and -next refactoring, IMHO, that has been communicated. We have one known regression in a TI AM65 board not getting link. The investigation is going a bit slow, a number of people are on vacation. We'll try to wrap it up, but don't think it should hold up the release. Current release - fix to a fix: - Bluetooth: L2CAP: fix attempting to adjust outgoing MTU, it broke some headphones and speakers Current release - regressions: - wifi: ath12k: fix packets received in WBM error ring with REO LUT enabled, fix Rx performance regression - wifi: iwlwifi: - fix crash due to a botched indexing conversion - mask reserved bits in chan_state_active_bitmap, avoid FW assert() Current release - new code bugs: - nf_conntrack: fix crash due to removal of uninitialised entry - eth: airoha: fix potential UaF in airoha_npu_get() Previous releases - regressions: - net: fix segmentation after TCP/UDP fraglist GRO - af_packet: fix the SO_SNDTIMEO constraint not taking effect and a potential soft lockup waiting for a completion - rpl: fix UaF in rpl_do_srh_inline() for sneaky skb geometry - virtio-net: fix recursive rtnl_lock() during probe() - eth: stmmac: populate entire system_counterval_t in get_time_fn() - eth: libwx: fix a number of crashes in the driver Rx path - hv_netvsc: prevent IPv6 addrconf after IFF_SLAVE lost that meaning Previous releases - always broken: - mptcp: fix races in handling connection fallback to pure TCP - rxrpc: assorted error handling and race fixes - sched: another batch of "security" fixes for qdiscs (QFQ, HTB) - tls: always refresh the queue when reading sock, avoid UaF - phy: don't register LEDs for genphy, avoid deadlock - Bluetooth: btintel: check if controller is ISO capable on btintel_classify_pkt_type(), work around FW returning incorrect capabilities Misc: - make OpenVPN Netlink input checking more strict before it makes it to a final release - wifi: cfg80211: remove scan request n_channels __counted_by, it's only yielding false positives" * tag 'net-6.16-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (66 commits) rxrpc: Fix to use conn aborts for conn-wide failures rxrpc: Fix transmission of an abort in response to an abort rxrpc: Fix notification vs call-release vs recvmsg rxrpc: Fix recv-recv race of completed call rxrpc: Fix irq-disabled in local_bh_enable() selftests/tc-testing: Test htb_dequeue_tree with deactivation and row emptying net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree net: bridge: Do not offload IGMP/MLD messages selftests: Add test cases for vlan_filter modification during runtime net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime tls: always refresh the queue when reading sock virtio-net: fix recursived rtnl_lock() during probe() net/mlx5: Update the list of the PCI supported devices hv_netvsc: Set VF priv_flags to IFF_NO_ADDRCONF before open to prevent IPv6 addrconf phonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept() Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU netfilter: nf_conntrack: fix crash due to removal of uninitialised entry net: fix segmentation after TCP/UDP fraglist GRO ipv6: mcast: Delay put pmc->idev in mld_del_delrec() net: airoha: fix potential use-after-free in airoha_npu_get() ...