linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-10-16	timers: Rename usleep_idle_range() to usleep_range_idle()	Anna-Maria Behnsen
	usleep_idle_range() is a variant of usleep_range(). Both are using usleep_range_state() as a base. To be able to find all the related functions in one go, rename it usleep_idle_range() to usleep_range_idle(). No functional change. Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: SeongJae Park <sj@kernel.org> Link: https://lore.kernel.org/all/20241014-devel-anna-maria-b4-timers-flseep-v3-4-dc8b907cb62f@linutronix.de
2024-10-15	ftrace: Rename ftrace_regs_return_value to ftrace_regs_get_return_value	Masami Hiramatsu (Google)
	Rename ftrace_regs_return_value to ftrace_regs_get_return_value as same as other ftrace_regs_get/set_* APIs. arm64 and riscv are already using this new name. Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Florent Revest <revest@chromium.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: bpf <bpf@vger.kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/172895573350.107311.7564634260652361511.stgit@devnote2 Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-15	ftrace: Use arch_ftrace_regs() for ftrace_regs_*() macros	Masami Hiramatsu (Google)
	Since the arch_ftrace_get_regs(fregs) is only valid when the FL_SAVE_REGS is set, we need to use `&arch_ftrace_regs()->regs` for ftrace_regs_*() APIs because those APIs are for ftrace_regs, not complete pt_regs. Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com> Cc: Florent Revest <revest@chromium.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: bpf <bpf@vger.kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Alan Maguire <alan.maguire@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/172895572290.107311.16057631001860177198.stgit@devnote2 Fixes: e4cf33ca4812 ("ftrace: Consolidate ftrace_regs accessor functions for archs using pt_regs") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-15	power: supply: core: remove {,devm_}power_supply_register_no_ws()	Thomas Weißschuh
	The same functionality is available through power_supply_config::no_wakeup_source, which is more idiomatic. All users of the old API have been converted. Also remove the argument "ws" from __power_supply_register(), as it is now always "true". Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Tzung-Bi Shih <tzungbi@kernel.org> Link: https://lore.kernel.org/r/20241005-power-supply-no-wakeup-source-v1-8-1d62bf9bcb1d@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-10-15	power: supply: core: add wakeup source inhibit by power_supply_config	Thomas Weißschuh
	To inhibit wakeup users currently have to use dedicated functions to register the power supply: {,devm_}power_supply_register_no_ws(). This is inconsistent to other runtime settings which can be configured through struct power_supply_config. It's also not obvious what _no_ws() is meant to mean. Extend power_supply_config to also be able to inhibit the wakeup source. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Tzung-Bi Shih <tzungbi@kernel.org> Link: https://lore.kernel.org/r/20241005-power-supply-no-wakeup-source-v1-1-1d62bf9bcb1d@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-10-15	power: supply: core: constify power_supply_battery_info::ocv_table	Thomas Weißschuh
	The power supply core never modifies the ocv table. Reflect this in the API, so drivers can mark their static tables as const. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20241005-power-supply-battery-const-v1-5-c1f721927048@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-10-15	power: supply: core: constify power_supply_battery_info::resist_table	Thomas Weißschuh
	The power supply core never modifies the resist table. Reflect this in the API, so drivers can mark their static tables as const. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20241005-power-supply-battery-const-v1-1-c1f721927048@weissschuh.net Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2024-10-15	tpm: fix unsigned/signed mismatch errors related to __calc_tpm2_event_size	Gregory Price
	__calc_tpm2_event_size returns 0 or a positive length, but return values are often interpreted as ints. Convert everything over to u32 to avoid signed/unsigned logic errors. Signed-off-by: Gregory Price <gourry@gourry.net> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-10-15	Merge tag 'bcachefs-2024-10-14' of git://evilpiepirate.org/bcachefs	Linus Torvalds
	Pull bcachefs fixes from Kent Overstreet: - New metadata version inode_has_child_snapshots This fixes bugs with handling of unlinked inodes + snapshots, in particular when an inode is reattached after taking a snapshot; deleted inodes now get correctly cleaned up across snapshots. - Disk accounting rewrite fixes - validation fixes for when a device has been removed - fix journal replay failing with "journal_reclaim_would_deadlock" - Some more small fixes for erasure coding + device removal - Assorted small syzbot fixes * tag 'bcachefs-2024-10-14' of git://evilpiepirate.org/bcachefs: (27 commits) bcachefs: Fix sysfs warning in fstests generic/730,731 bcachefs: Handle race between stripe reuse, invalidate_stripe_to_dev bcachefs: Fix kasan splat in new_stripe_alloc_buckets() bcachefs: Add missing validation for bch_stripe.csum_granularity_bits bcachefs: Fix missing bounds checks in bch2_alloc_read() bcachefs: fix uaf in bch2_dio_write_done() bcachefs: Improve check_snapshot_exists() bcachefs: Fix bkey_nocow_lock() bcachefs: Fix accounting replay flags bcachefs: Fix invalid shift in member_to_text() bcachefs: Fix bch2_have_enough_devs() for BCH_SB_MEMBER_INVALID bcachefs: __wait_for_freeing_inode: Switch to wait_bit_queue_entry bcachefs: Check if stuck in journal_res_get() closures: Add closure_wait_event_timeout() bcachefs: Fix state lock involved deadlock bcachefs: Fix NULL pointer dereference in bch2_opt_to_text bcachefs: Release transaction before wake up bcachefs: add check for btree id against max in try read node bcachefs: Disk accounting device validation fixes bcachefs: bch2_inode_or_descendents_is_open() ...
2024-10-15	gpu: host1x: Set up device DMA parameters	Thierry Reding
	In order to store device DMA parameters, the DMA framework depends on the device's dma_parms field to point at a valid memory location. Add backing storage for this in struct host1x_memory_context and point to it. Reported-by: Jonathan Hunter <jonathanh@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240916133320.368620-1-thierry.reding@gmail.com (cherry picked from commit b4ad4ef374d66cc8df3188bb1ddb65bce5fc9e50) Signed-off-by: Thierry Reding <treding@nvidia.com>
2024-10-15	ftrace: Consolidate ftrace_regs accessor functions for archs using pt_regs	Steven Rostedt
	Most architectures use pt_regs within ftrace_regs making a lot of the accessor functions just calls to the pt_regs internally. Instead of duplication this effort, use a HAVE_ARCH_FTRACE_REGS for architectures that have their own ftrace_regs that is not based on pt_regs and will define all the accessor functions, and for the architectures that just use pt_regs, it will leave it undefined, and the default accessor functions will be used. Note, this will also make it easier to add new accessor functions to ftrace_regs as it will mean having to touch less architectures. Cc: <linux-arch@vger.kernel.org> Cc: "x86@kernel.org" <x86@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Naveen N Rao <naveen@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/20241010202114.2289f6fd@gandalf.local.home Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390 Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> # powerpc Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-15	debugobjects: Prepare for batching	Thomas Gleixner
	Move the debug_obj::object pointer into a union and add a pointer to the last node in a batch. That allows to implement batch processing efficiently by utilizing the stack property of hlist: When the first object of a batch is added to the list, then the batch pointer is set to the hlist node of the object itself. Any subsequent add retrieves the pointer to the last node from the first object in the list and uses that for storing the last node pointer in the newly added object. Add the pointer to the data structure and ensure that all relevant pool sizes are strictly batch sized. The actual batching implementation follows in subsequent changes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com> Link: https://lore.kernel.org/all/20241007164914.139204961@linutronix.de
2024-10-15	of/address: Constify of_busses[] array and pointers	Rob Herring (Arm)
	The of_busses array is fixed, so it and all struct of_bus pointers can be const. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20241010-dt-const-v1-7-87a51f558425@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-10-15	of: Constify struct property pointers	Rob Herring (Arm)
	Most accesses to struct property do not modify it, so constify struct property pointers where ever possible in the DT core code. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20241010-dt-const-v1-4-87a51f558425@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-10-15	of: Constify struct device_node function arguments	Rob Herring (Arm)
	Functions which don't change the refcount or otherwise modify struct device_node can make struct device_node const. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20241010-dt-const-v1-3-87a51f558425@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-10-15	PCI: Constify pci_register_io_range() fwnode_handle	Rob Herring (Arm)
	pci_register_io_range() does not modify the passed in fwnode_handle, so make it const. Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20241010-dt-const-v1-1-87a51f558425@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-10-15	iomap: remove iomap_file_buffered_write_punch_delalloc	Christoph Hellwig
	Currently iomap_file_buffered_write_punch_delalloc can be called from XFS either with the invalidate lock held or not. To fix this while keeping the locking in the file system and not the iomap library code we'll need to life the locking up into the file system. To prepare for that, open code iomap_file_buffered_write_punch_delalloc in the only caller, and instead export iomap_write_delalloc_release. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-10-15	iomap: factor out a iomap_last_written_block helper	Christoph Hellwig
	Split out a pice of logic from iomap_file_buffered_write_punch_delalloc that is useful for all iomap_end implementations. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2024-10-15	net: phy: support 'active-high' property for PHY LEDs	Daniel Golle
	In addition to 'active-low' and 'inactive-high-impedance' also support 'active-high' property for PHY LED pin configuration. As only either 'active-high' or 'active-low' can be set at the same time, WARN and return an error in case both are set. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/91598487773d768f254d5faf06cf65b13e972f0e.1728558223.git.daniel@makrotopia.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-15	iommu: Remove iommu_present()	Lu Baolu
	The last callsite of iommu_present() is removed by commit <45c690aea8ee> ("drm/tegra: Use iommu_paging_domain_alloc()"). Remove it to avoid dead code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20241009051808.29455-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-10-15	ALSA/hda: intel-sdw-acpi: add support for sdw-manager-list property read	Pierre-Louis Bossart
	The DisCo for SoundWire 2.0 spec adds support for a new sdw-manager-list property. Add it in backwards-compatible mode with 'sdw-master-count', which assumed that all links between 0..count-1 exist. Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/20241001070611.63288-5-yung-chuan.liao@linux.intel.com
2024-10-14	bpf: Add kmem_cache iterator	Namhyung Kim
	The new "kmem_cache" iterator will traverse the list of slab caches and call attached BPF programs for each entry. It should check the argument (ctx.s) if it's NULL before using it. Now the iteration grabs the slab_mutex only if it traverse the list and releases the mutex when it runs the BPF program. The kmem_cache entry is protected by a refcount during the execution. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> #slab Link: https://lore.kernel.org/r/20241010232505.1339892-2-namhyung@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-14	net: napi: Add napi_config	Joe Damato
	Add a persistent NAPI config area for NAPI configuration to the core. Drivers opt-in to setting the persistent config for a NAPI by passing an index when calling netif_napi_add_config. napi_config is allocated in alloc_netdev_mqs, freed in free_netdev (after the NAPIs are deleted). Drivers which call netif_napi_add_config will have persistent per-NAPI settings: NAPI IDs, gro_flush_timeout, and defer_hard_irq settings. Per-NAPI settings are saved in napi_disable and restored in napi_enable. Co-developed-by: Martin Karsten <mkarsten@uwaterloo.ca> Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca> Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241011184527.16393-6-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-14	net: napi: Make gro_flush_timeout per-NAPI	Joe Damato
	Allow per-NAPI gro_flush_timeout setting. The existing sysfs parameter is respected; writes to sysfs will write to all NAPI structs for the device and the net_device gro_flush_timeout field. Reads from sysfs will read from the net_device field. The ability to set gro_flush_timeout on specific NAPI instances will be added in a later commit, via netdev-genl. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20241011184527.16393-4-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-14	net: napi: Make napi_defer_hard_irqs per-NAPI	Joe Damato
	Add defer_hard_irqs to napi_struct in preparation for per-NAPI settings. The existing sysfs parameter is respected; writes to sysfs will write to all NAPI structs for the device and the net_device defer_hard_irq field. Reads from sysfs show the net_device field. The ability to set defer_hard_irqs on specific NAPI instances will be added in a later commit, via netdev-genl. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20241011184527.16393-2-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-14	net: add TIME_WAIT logic to sk_to_full_sk()	Eric Dumazet
	TCP will soon attach TIME_WAIT sockets to some ACK and RST. Make sure sk_to_full_sk() detects this and does not return a non full socket. v3: also changed sk_const_to_full_sk() Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Brian Vazquez <brianvv@google.com> Link: https://patch.msgid.link/20241010174817.1543642-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-14	logic_pio: Constify fwnode_handle	Rob Herring (Arm)
	The fwnode_handle passed into find_io_range_by_fwnode() and logic_pio_trans_hwaddr() are not modified, so make them const. Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20241010-dt-const-v1-2-87a51f558425@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-10-14	dmaengine: acpi: Clean up headers	Andy Shevchenko
	There is a few things done: - include only the headers we are direct user of - when pointer is in use, provide a forward declaration - add missing headers - sort alphabetically Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20241007150436.2183575-4-andriy.shevchenko@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-10-14	dmaengine: acpi: Drop unused devm_acpi_dma_controller_free()	Andy Shevchenko
	After introduction a few years ago the devm_acpi_dma_controller_free() was never used. Drop it. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20241007150436.2183575-2-andriy.shevchenko@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2024-10-14	Merge patch series "ovl: file descriptors based layer setup"	Christian Brauner
	Christian Brauner <brauner@kernel.org> says: Currently overlayfs only allows specifying layers through path names. This is inconvenient for users such as systemd that want to assemble an overlayfs mount purely based on file descriptors. When porting overlayfs to the new mount api I already mentioned this. This enables user to specify both: fsconfig(fd_overlay, FSCONFIG_SET_FD, "upperdir+", NULL, fd_upper); fsconfig(fd_overlay, FSCONFIG_SET_FD, "workdir+", NULL, fd_work); fsconfig(fd_overlay, FSCONFIG_SET_FD, "lowerdir+", NULL, fd_lower1); fsconfig(fd_overlay, FSCONFIG_SET_FD, "lowerdir+", NULL, fd_lower2); in addition to: fsconfig(fd_overlay, FSCONFIG_SET_STRING, "upperdir+", "/upper", 0); fsconfig(fd_overlay, FSCONFIG_SET_STRING, "workdir+", "/work", 0); fsconfig(fd_overlay, FSCONFIG_SET_STRING, "lowerdir+", "/lower1", 0); fsconfig(fd_overlay, FSCONFIG_SET_STRING, "lowerdir+", "/lower2", 0); The selftest contain an example for this. * patches from https://lore.kernel.org/r/20241014-work-overlayfs-v3-0-32b3fed1286e@kernel.org: selftests: add overlayfs fd mounting selftests selftests: use shared header Documentation,ovl: document new file descriptor based layers ovl: specify layers via file descriptors fs: add helper to use mount option as path or fd Link: https://lore.kernel.org/r/20241014-work-overlayfs-v3-0-32b3fed1286e@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-14	fs: add helper to use mount option as path or fd	Christian Brauner
	Allow filesystems to use a mount option either as a file or path. Link: https://lore.kernel.org/r/20241014-work-overlayfs-v3-1-32b3fed1286e@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-10-14	sched: Improve cache locality of RSEQ concurrency IDs for intermittent workloads	Mathieu Desnoyers
	commit 223baf9d17f25 ("sched: Fix performance regression introduced by mm_cid") introduced a per-mm/cpu current concurrency id (mm_cid), which keeps a reference to the concurrency id allocated for each CPU. This reference expires shortly after a 100ms delay. These per-CPU references keep the per-mm-cid data cache-local in situations where threads are running at least once on each CPU within each 100ms window, thus keeping the per-cpu reference alive. However, intermittent workloads behaving in bursts spaced by more than 100ms on each CPU exhibit bad cache locality and degraded performance compared to purely per-cpu data indexing, because concurrency IDs are allocated over various CPUs and cores, therefore losing cache locality of the associated data. Introduce the following changes to improve per-mm-cid cache locality: - Add a "recent_cid" field to the per-mm/cpu mm_cid structure to keep track of which mm_cid value was last used, and use it as a hint to attempt re-allocating the same concurrency ID the next time this mm/cpu needs to allocate a concurrency ID, - Add a per-mm CPUs allowed mask, which keeps track of the union of CPUs allowed for all threads belonging to this mm. This cpumask is only set during the lifetime of the mm, never cleared, so it represents the union of all the CPUs allowed since the beginning of the mm lifetime (note that the mm_cpumask() is really arch-specific and tailored to the TLB flush needs, and is thus _not_ a viable approach for this), - Add a per-mm nr_cpus_allowed to keep track of the weight of the per-mm CPUs allowed mask (for fast access), - Add a per-mm max_nr_cid to keep track of the highest number of concurrency IDs allocated for the mm. This is used for expanding the concurrency ID allocation within the upper bound defined by: min(mm->nr_cpus_allowed, mm->mm_users) When the next unused CID value reaches this threshold, stop trying to expand the cid allocation and use the first available cid value instead. Spreading allocation to use all the cid values within the range [ 0, min(mm->nr_cpus_allowed, mm->mm_users) - 1 ] improves cache locality while preserving mm_cid compactness within the expected user limits, - In __mm_cid_try_get, only return cid values within the range [ 0, mm->nr_cpus_allowed ] rather than [ 0, nr_cpu_ids ]. This prevents allocating cids above the number of allowed cpus in rare scenarios where cid allocation races with a concurrent remote-clear of the per-mm/cpu cid. This improvement is made possible by the addition of the per-mm CPUs allowed mask, - In sched_mm_cid_migrate_to, use mm->nr_cpus_allowed rather than t->nr_cpus_allowed. This criterion was really meant to compare the number of mm->mm_users to the number of CPUs allowed for the entire mm. Therefore, the prior comparison worked fine when all threads shared the same CPUs allowed mask, but not so much in scenarios where those threads have different masks (e.g. each thread pinned to a single CPU). This improvement is made possible by the addition of the per-mm CPUs allowed mask. * Benchmarks Each thread increments 16kB worth of 8-bit integers in bursts, with a configurable delay between each thread's execution. Each thread run one after the other (no threads run concurrently). The order of thread execution in the sequence is random. The thread execution sequence begins again after all threads have executed. The 16kB areas are allocated with rseq_mempool and indexed by either cpu_id, mm_cid (not cache-local), or cache-local mm_cid. Each thread is pinned to its own core. Testing configurations: 8-core/1-L3: Use 8 cores within a single L3 24-core/24-L3: Use 24 cores, 1 core per L3 192-core/24-L3: Use 192 cores (all cores in the system) 384-thread/24-L3: Use 384 HW threads (all HW threads in the system) Intermittent workload delays between threads: 200ms, 10ms. Hardware: CPU(s): 384 On-line CPU(s) list: 0-383 Vendor ID: AuthenticAMD Model name: AMD EPYC 9654 96-Core Processor Thread(s) per core: 2 Core(s) per socket: 96 Socket(s): 2 Caches (sum of all): L1d: 6 MiB (192 instances) L1i: 6 MiB (192 instances) L2: 192 MiB (192 instances) L3: 768 MiB (24 instances) Each result is an average of 5 test runs. The cache-local speedup is calculated as: (cache-local mm_cid) / (mm_cid). Intermittent workload delay: 200ms per-cpu mm_cid cache-local mm_cid cache-local speedup (ns) (ns) (ns) 8-core/1-L3 1374 19289 1336 14.4x 24-core/24-L3 2423 26721 1594 16.7x 192-core/24-L3 2291 15826 2153 7.3x 384-thread/24-L3 1874 13234 1907 6.9x Intermittent workload delay: 10ms per-cpu mm_cid cache-local mm_cid cache-local speedup (ns) (ns) (ns) 8-core/1-L3 662 756 686 1.1x 24-core/24-L3 1378 3648 1035 3.5x 192-core/24-L3 1439 10833 1482 7.3x 384-thread/24-L3 1503 10570 1556 6.8x [ This deprecates the prior "sched: NUMA-aware per-memory-map concurrency IDs" patch series with a simpler and more general approach. ] [ This patch applies on top of v6.12-rc1. ] Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Marco Elver <elver@google.com> Link: https://lore.kernel.org/lkml/20240823185946.418340-1-mathieu.desnoyers@efficios.com/
2024-10-14	Merge branch 'tip/sched/urgent'	Peter Zijlstra
	Sync with sched/urgent to avoid conflicts. Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2024-10-14	memstick: Constify struct memstick_device_id	Christophe JAILLET
	'struct memstick_device_id' are not modified in these drivers. Constifying this structure moves some data to a read-only section, so increases overall security. Update memstick_dev_match(), memstick_bus_match() and struct memstick_driver accordingly. On a x86_64, with allmodconfig, as an example: Before: ====== text data bss dec hex filename 74055 3455 88 77598 12f1e drivers/memstick/core/ms_block.o After: ===== text data bss dec hex filename 74087 3423 88 77598 12f1e drivers/memstick/core/ms_block.o Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/6509d6f6ed64193f04e747a98ccea7492c976ca8.1727540434.git.christophe.jaillet@wanadoo.fr Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-10-14	mmc: core: Add definitions for SD UHS-II cards	Victor Shih
	Add UHS-II specific data structures for commands and defines for registers, as described in Part 1 UHS-II Addendum Version 1.01. UHS-II related definitions are listed below: 1. UHS-II card capability: sd_uhs2_caps{} 2. UHS-II configuration: sd_uhs2_config{} 3. UHS-II register I/O address and register field definitions: sd_uhs2.h Signed-off-by: Jason Lai <jason.lai@genesyslogic.com.tw> Signed-off-by: Victor Shih <victor.shih@genesyslogic.com.tw> Link: https://lore.kernel.org/r/20240913102836.6144-6-victorshihgli@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-10-14	mmc: core: Extend support for mmc regulators with a vqmmc2	Ulf Hansson
	To allow an additional external regulator to be controlled by an mmc host driver, let's add support for a vqmmc2 regulator to the mmc core. For an SD UHS-II interface the vqmmc2 regulator may correspond to the so called vdd2 supply, as described by the SD spec. Initially, only 1.8V is needed, hence limit the new helper function, mmc_regulator_set_vqmmc2() to this too. Note that, to allow for flexibility mmc host drivers need to manage the enable/disable of the vqmmc2 regulator themselves, while the regulator is looked up through the common mmc_regulator_get_supply(). Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20240913102836.6144-5-victorshihgli@gmail.com
2024-10-14	mmc: core: Announce successful insertion of an SD UHS-II card	Ulf Hansson
	To inform the users about SD UHS-II cards, let's extend the print at card insertion with a "UHS-II" substring. Within this change, it seems reasonable to convert from using "ultra high speed" into "UHS-I speed", for the UHS-I type, as it should makes it more clear. Note that, the new print for UHS-II cards doesn't include the actual selected speed mode. Instead, this is going to be added from subsequent change. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20240913102836.6144-4-victorshihgli@gmail.com
2024-10-14	mmc: core: Prepare to support SD UHS-II cards	Ulf Hansson
	The SD UHS-II interface was introduced to the SD spec v4.00 several years ago. The interface is fundamentally different from an electrical and a protocol point of view, comparing to the legacy SD interface. However, the legacy SD protocol is supported through a specific transport layer (SD-TRAN) defined in the UHS-II addendum of the spec. This allows the SD card to be managed in a very similar way as a legacy SD card, hence a lot of code can be re-used to support these new types of cards through the mmc subsystem. Moreover, an SD card that supports the UHS-II interface shall also be backwards compatible with the legacy SD interface, which allows a UHS-II card to be inserted into a legacy slot. As a matter of fact, this is already supported by mmc subsystem as of today. To prepare to add support for UHS-II, this change puts the basic foundation in the mmc core in place, allowing it to be more easily reviewed before subsequent changes implements the actual support. Basically, the approach here adds a new UHS-II bus_ops type and adds a separate initialization path for the UHS-II card. The intent is to avoid us from sprinkling the legacy initialization path, but also to simplify implementation of the UHS-II specific bits. At this point, there is only one new host ops added to manage the various ios settings needed for UHS-II. Additional host ops that are needed, are being added from subsequent changes. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20240913102836.6144-3-victorshihgli@gmail.com
2024-10-14	mmc: core: Add open-ended Ext memory addressing	Avri Altman
	For open-ended read/write - just send CMD22 before issuing the command. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20241006051148.160278-5-avri.altman@wdc.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-10-14	mmc: sd: Add Extension memory addressing	Avri Altman
	SDUC memory addressing spans beyond 2TB and up to 128TB. Therefore, 38 bits are required to access the entire memory space of all sectors. Those extra 6 bits are to be carried by CMD22 prior of sending read/write/erase commands: CMD17, CMD18, CMD24, CMD25, CMD32, and CMD33. CMD22 will carry the higher order 6 bits, and must precedes any of the above commands even if it targets sector < 2TB. No error related to address or length is indicated in CMD22 but rather in the read/write command itself. Tested-by: Ricky WU <ricky_wu@realtek.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20241006051148.160278-3-avri.altman@wdc.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-10-14	mmc: sd: SDUC Support Recognition	Avri Altman
	Ultra Capacity SD cards (SDUC) was already introduced in SD7.0. Those cards support capacity larger than 2TB and up to including 128TB. ACMD41 was extended to support the host-card handshake during initialization. The card expects that the HCS & HO2T bits to be set in the command argument, and sets the applicable bits in the R3 returned response. On the contrary, if a SDUC card is inserted to a non-supporting host, it will never respond to this ACMD41 until eventually, the host will timed out and give up. Also, add SD CSD version 3.0 - designated for SDUC, and properly parse the csd register as the c_size field got expanded to 28 bits. Do not enable SDUC for now - leave it to the last patch in the series. Tested-by: Ricky WU <ricky_wu@realtek.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Avri Altman <avri.altman@wdc.com> Link: https://lore.kernel.org/r/20241006051148.160278-2-avri.altman@wdc.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-10-14	mmc: core: Add SD card quirk for broken poweroff notification	Keita Aihara
	GIGASTONE Gaming Plus microSD cards manufactured on 02/2022 report that they support poweroff notification and cache, but they are not working correctly. Flush Cache bit never gets cleared in sd_flush_cache() and Poweroff Notification Ready bit also never gets set to 1 within 1 second from the end of busy of CMD49 in sd_poweroff_notify(). This leads to I/O error and runtime PM error state. I observed that the same card manufactured on 01/2024 works as expected. This problem seems similar to the Kingston cards fixed with commit c467c8f08185 ("mmc: Add MMC_QUIRK_BROKEN_SD_CACHE for Kingston Canvas Go Plus from 11/2019") and should be handled using quirks. CID for the problematic card is here. 12345641535443002000000145016200 Manufacturer ID is 0x12 and defined as CID_MANFID_GIGASTONE as of now, but would like comments on what naming is appropriate because MID list is not public and not sure it's right. Signed-off-by: Keita Aihara <keita.aihara@sony.com> Link: https://lore.kernel.org/r/20240913094417.GA4191647@sony.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2024-10-14	sched/fair: Fix external p->on_rq users	Peter Zijlstra
	Sean noted that ever since commit 152e11f6df29 ("sched/fair: Implement delayed dequeue") KVM's preemption notifiers have started mis-classifying preemption vs blocking. Notably p->on_rq is no longer sufficient to determine if a task is runnable or blocked -- the aforementioned commit introduces tasks that remain on the runqueue even through they will not run again, and should be considered blocked for many cases. Add the task_is_runnable() helper to classify things and audit all external users of the p->on_rq state. Also add a few comments. Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue") Reported-by: Sean Christopherson <seanjc@google.com> Tested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20241010091843.GK33184@noisy.programming.kicks-ass.net
2024-10-14	Merge tag 'v6.12-rc3' of ↵	Bartosz Golaszewski
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into gpio/for-next Linux 6.12-rc3
2024-10-14	drivers/base: Remove unused auxiliary_find_device	Dr. David Alan Gilbert
	auxiliary_find_device has been unused since commit 1c5de097bea3 ("net/mlx5: Fix mlx5_get_next_dev() peer device matching") which was the only use since it was originally added. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20240929141112.69824-1-linux@treblig.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-14	list: Remove duplicated and unused macro list_for_each_reverse	Zijun Hu
	Remove macro list_for_each_reverse due to below reasons: - it is same as list_for_each_prev. - it is not used by current kernel tree. Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Link: https://lore.kernel.org/r/20240917-fix_list-v2-1-d2914665e89f@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-14	Merge 6.12-rc3 into usb-next	Greg Kroah-Hartman
	We need the USB fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-13	misc: keba: Add UART devices	Gerhard Engleder
	Add support for the UART auxiliary devices. This enables access to up to 3 different UARTs, which are implemented in the FPGA. Signed-off-by: Gerhard Engleder <eg@keba.com> Link: https://lore.kernel.org/r/20241011191257.19702-9-gerhard@engleder-embedded.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-13	misc: keba: Add battery device	Gerhard Engleder
	Add support for the battery auxiliary device. This enables monitoring of the battery. Signed-off-by: Gerhard Engleder <eg@keba.com> Link: https://lore.kernel.org/r/20241011191257.19702-8-gerhard@engleder-embedded.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-13	misc: keba: Add fan device	Gerhard Engleder
	Add support for the fan auxiliary device. This enables monitoring of the fan. Signed-off-by: Gerhard Engleder <eg@keba.com> Link: https://lore.kernel.org/r/20241011191257.19702-7-gerhard@engleder-embedded.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>