path: root/include/linux
AgeCommit messageAuthor
2025-03-05arch: x86: add IPC mailbox accessor function and add SoC register accessDavid E. Box
- Exports intel_pmc_ipc() for host access to the PMC IPC mailbox - Enables the host to access specific SoC registers through the PMC firmware using IPC commands. This access method is necessary for registers that are not available through direct Memory-Mapped I/O (MMIO), which is used for other accessible parts of the PMC. Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Chao Qin <chao.qin@intel.com> Signed-off-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://patch.msgid.link/20250227121522.1802832-4-yong.liang.choong@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-05ASoC: Merge up fixesMark Brown
Merge branch 'for-6.14' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into asoc-6.15 to avoid a bunch of add/add conflicts.
2025-03-05fs/pipe: remove buggy and unused 'helper' functionLinus Torvalds
While looking for incorrect users of the pipe head/tail fields (see commit c27c66afc449: "fs/pipe: Fix pipe_occupancy() with 16-bit indexes"), I found a bug in pipe_discard_from() that looked entirely broken. However, the fix is trivial: this buggy function isn't actually called by anything, so let's just remove it ASAP. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-05file: add fput and file_ref_put routines optimized for use when closing a fdMateusz Guzik
Vast majority of the time closing a file descriptor also operates on the last reference, where a regular fput usage will result in 2 atomics. This can be changed to only suffer 1. See commentary above file_ref_put_close() for more information. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250305123644.554845-2-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-05include/linux/pipe_fs_i: Add htmldoc annotation for "head_tail" memberK Prateek Nayak
Add htmldoc annotation for the newly introduced "head_tail" member describing it to be a union of the pipe_inode_info's @head and @tail members. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/lkml/20250305204609.5e64768e@canb.auug.org.au/ Fixes: 3d252160b818 ("fs/pipe: Read pipe->{head,tail} atomically outside pipe->mutex") Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-05fs/pipe: Fix pipe_occupancy() with 16-bit indexesLinus Torvalds
The pipe_occupancy() logic implicitly relied on the natural unsigned modulo arithmetic in C, but that doesn't work for the new 'pipe_index_t' case, since any arithmetic will be done in 'int' (and here we had also made it 'unsigned int' due to the function call boundary). So make the modulo arithmetic explicit by casting the result to the proper type. Cc: Oleg Nesterov <oleg@redhat.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Alexey Gladkov <legion@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Link: https://lore.kernel.org/all/CAHk-=wjyHsGLx=rxg6PKYBNkPYAejgo7=CbyL3=HGLZLsAaJFQ@mail.gmail.com/ Fixes: 3d252160b818 ("fs/pipe: Read pipe->{head,tail} atomically outside pipe->mutex") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
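The promotion problem described in this commit can be reproduced in user space. The following is a minimal sketch, assuming a 16-bit `pipe_index_t`; the helper names `occupancy_promoted()` and `occupancy_cast()` are hypothetical and are not the kernel's functions.

```c
#include <stdint.h>

typedef uint16_t pipe_index_t;   /* assumption: 16-bit index type */

/* The subtraction happens in 'int' after integer promotion, so when head
 * has wrapped past tail the negative result becomes a huge unsigned int. */
static unsigned int occupancy_promoted(pipe_index_t head, pipe_index_t tail)
{
    return head - tail;
}

/* Casting the result back to the index type makes the modulo-2^16
 * arithmetic explicit, which is the approach the commit describes. */
static unsigned int occupancy_cast(pipe_index_t head, pipe_index_t tail)
{
    return (pipe_index_t)(head - tail);
}
```

With a wrapped head of 2 and a tail of 65534, the cast variant reports 4 occupied slots, while the promoted variant does not.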
2025-03-05treewide: fix typo 'unsigned __init128' -> 'unsigned __int128'Vincent Mailhol
"int" was misspelled as "init" in the code comments in the bits.h and const.h files. Fix the typo. CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2025-03-05gpio: Hide valid_mask from direct assignmentsMatti Vaittinen
The valid_mask member of the struct gpio_chip is unconditionally written by the GPIO core at driver registration. Current documentation does not mention this but just says the valid_mask is used if it's not NULL. This lured me to try populating it directly in the GPIO driver probe instead of using the init_valid_mask() callback. It took some retries with different bitmaps and eventually a bit of code-reading to understand why the valid_mask was not obeyed. I could've avoided this trial and error if the valid_mask was hidden in the struct gpio_device instead of being a visible member of the struct gpio_chip. Help the next developer who decides to directly populate the valid_mask in struct gpio_chip by hiding the valid_mask in struct gpio_device and keep it internal to the GPIO core. Suggested-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/4547ca90d910d60cab3d56d864d59ddde47a5e93.1741180097.git.mazziesaccount@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-03-05gpio: Add a valid_mask getterMatti Vaittinen
The valid_mask member of the struct gpio_chip is unconditionally written by the GPIO core at driver registration. It shouldn't be directly populated by drivers. This can be prevented by moving it from the struct gpio_chip to struct gpio_device, which is internal to the GPIO core. As a preparatory step, provide a getter function which can be used by those drivers which need the valid_mask information. Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/026f9d78502eca883bfe3faeb684e23d5d6c5e84.1741180097.git.mazziesaccount@gmail.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-03-05posix-clock: Store file pointer in struct posix_clock_contextWojtek Wasko
File descriptor based pc_clock_*() operations of dynamic posix clocks have access to the file pointer and implement permission checks in the generic code before invoking the relevant dynamic clock callback. Character device operations (open, read, poll, ioctl) do not implement a generic permission control and the dynamic clock callbacks have no access to the file pointer to implement them. Extend struct posix_clock_context with a struct file pointer and initialize it in posix_clock_open(), so that all dynamic clock callbacks can access it. Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Wojtek Wasko <wwasko@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-05pidfs: record exit code and cgroupid at exitChristian Brauner
Record the exit code and cgroupid in release_task() and stash in struct pidfs_exit_info so it can be retrieved even after the task has been reaped. Link: https://lore.kernel.org/r/20250305-work-pidfs-kill_on_last_close-v3-5-c8c3d8361705@kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-05fscrypt: Change fscrypt_encrypt_pagecache_blocks() to take a folioMatthew Wilcox (Oracle)
ext4 and ceph already have a folio to pass; f2fs needs to be properly converted but this will do for now. This removes a reference to page->index and page->mapping as well as removing a call to compound_head(). Signed-off-by: "Matthew Wilcox (Oracle)" <willy@infradead.org> Link: https://lore.kernel.org/r/20250304170224.523141-1-willy@infradead.org Acked-by: Eric Biggers <ebiggers@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-05VFS: Change vfs_mkdir() to return the dentry.NeilBrown
vfs_mkdir() does not guarantee to leave the child dentry hashed or make it positive on success, and in many such cases the filesystem had to use a different dentry which it can now return. This patch changes vfs_mkdir() to return the dentry provided by the filesystem, which is hashed and positive when provided. This reduces the number of cases where the resulting dentry is not positive to a handful which don't deserve extra efforts. The only callers of vfs_mkdir() which are interested in the resulting inode are in-kernel filesystem clients: cachefiles, nfsd, smb/server. The only filesystems that don't reliably provide the inode are: - kernfs, tracefs which these clients are unlikely to be interested in - cifs in some configurations would need to do a lookup to find the created inode, but doesn't. cifs cannot be exported via NFS, is unlikely to be used by cachefiles, and smb/server only has a soft requirement for the inode, so this is unlikely to be a problem in practice. - hostfs, nfs, cifs may need to do a lookup (rarely for NFS) and it is possible for a race to make that lookup fail. Actual failure is unlikely and providing callers handle negative dentries gracefully they will fail-safe. So this patch removes the lookup code in nfsd and smb/server and adjusts them to fail safe if a negative dentry is provided: - cachefiles already fails safe by restarting the task from the top - it still does with this change, though it no longer calls cachefiles_put_directory() as that will crash if the dentry is negative. - nfsd reports "Server-fault" which is what it used to do if the lookup failed. This will never happen on any file-systems that it can actually export, so this is of no consequence. I removed the fh_update() call as that is not needed and out-of-place. A subsequent nfsd_create_setattr() call will call fh_update() when needed.
- smb/server only wants the inode to call ksmbd_smb_inherit_owner() which updates ->i_uid (without calling notify_change() or similar) which can be safely skipped on cifs (I hope). If a different dentry is returned, the first one is put. If necessary the fact that it is new can be determined by comparing pointers. A new dentry will certainly have a new pointer (as the old is put after the new is obtained). Similarly if an error is returned (via ERR_PTR()) the original dentry is put. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20250227013949.536172-7-neilb@suse.de Signed-off-by: Christian Brauner <brauner@kernel.org>
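The calling convention this commit describes (return the dentry to use, or an error encoded via ERR_PTR(), and fail safe on a negative dentry) can be illustrated with a user-space mock. This is only a sketch: `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()` are re-implemented here to mirror the kernel helpers, and `struct dentry`, `mock_mkdir()` and `make_dir()` are hypothetical stand-ins, not kernel code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* User-space re-implementations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, heavily reduced dentry: only tracks whether it is positive. */
struct dentry { int positive; };

/* Hypothetical stand-in for the new vfs_mkdir() contract: return an
 * ERR_PTR() on failure, otherwise the dentry the caller should use,
 * which may differ from the one passed in. */
static struct dentry *mock_mkdir(struct dentry *orig, struct dentry *alt, long err)
{
    if (err)
        return ERR_PTR(err);
    return alt ? alt : orig;
}

/* Caller pattern: propagate errors, fail safe on a negative dentry. */
static long make_dir(struct dentry *orig, struct dentry *alt, long err,
                     struct dentry **out)
{
    struct dentry *d = mock_mkdir(orig, alt, err);

    if (IS_ERR(d))
        return PTR_ERR(d);
    if (!d->positive)
        return -EIO;   /* assumption: the error code chosen by this sketch */
    *out = d;
    return 0;
}
```

A caller that needs to know whether a different dentry came back can compare the returned pointer with the one it passed in, as the commit message notes.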
2025-03-05nfs: change mkdir inode_operation to return alternate dentry if needed.NeilBrown
mkdir now allows a different dentry to be returned which is sometimes relevant for nfs. This patch changes the nfs_rpc_ops mkdir op to return a dentry, and passes that back to the caller. The mkdir nfs_rpc_op will return NULL if the original dentry should be used. This matches the mkdir inode_operation. nfs4_do_create() is duplicated to nfs4_do_mkdir() which is changed to handle the specifics of directories. Consequently the current special handling for directories is removed from nfs4_do_create() Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20250227013949.536172-6-neilb@suse.de Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-05mm/slab: call kmalloc_noprof() unconditionally in kmalloc_array_noprof()Ye Bin
If 'n' or 'size' isn't a builtin constant, we used to call __kmalloc() before commit 7bd230a26648 ("mm/slab: enable slab allocation tagging for kmalloc and friends"), which inadvertently changed both paths to kmalloc_noprof(). As Harry Yoo points out we can just call kmalloc_noprof() unconditionally. If the compiler knows n and size are constants it doesn't guarantee that bytes will be also seen as constant, and that is the important test in kmalloc_noprof() anyway, so we can just defer to it always. [ vbabka@suse.cz: change as Harry suggested and adjust commit log ] Fixes: 7bd230a26648 ("mm/slab: enable slab allocation tagging for kmalloc and friends") Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
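The resulting shape (check the multiplication once, then always defer to the single-size allocator) can be sketched in user space. This is an analogy, not the kernel code: `alloc_array()` is a hypothetical name, `malloc()` stands in for kmalloc_noprof(), and `__builtin_mul_overflow()` is a GCC/Clang builtin standing in for the kernel's overflow check.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space analogue of an array allocator: reject a size
 * computation that overflows, otherwise call the plain allocator with the
 * total byte count, with no separate constant/non-constant paths. */
static void *alloc_array(size_t n, size_t size)
{
    size_t bytes;

    if (__builtin_mul_overflow(n, size, &bytes))
        return NULL;
    return malloc(bytes);
}
```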
2025-03-04Merge branches 'docs.2025.02.04a', 'lazypreempt.2025.03.04a', 'misc.2025.03.04a', 'srcu.2025.02.05a' and 'torture.2025.02.05a'Boqun Feng
2025-03-04rcu: Use _full() API to debug synchronize_rcu()Uladzislau Rezki (Sony)
Switch to using the get_state_synchronize_rcu_full() and poll_state_synchronize_rcu_full() pair to debug a normal synchronize_rcu() call. Using only the non-full APIs to identify whether a grace period has passed might lead to a false-positive kernel splat. This can happen because get_state_synchronize_rcu() compresses both normal and expedited states into one single unsigned long value, so a poll_state_synchronize_rcu() can miss GP-completion when synchronize_rcu()/synchronize_rcu_expedited() concurrently run. To address this, switch to poll_state_synchronize_rcu_full() and get_state_synchronize_rcu_full() APIs, which use separate variables for expedited and normal states. Reported-by: cheung wall <zzqq0103.hey@gmail.com> Closes: https://lore.kernel.org/lkml/Z5ikQeVmVdsWQrdD@pc636/T/ Fixes: 988f569ae041 ("rcu: Reduce synchronize_rcu() latency") Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20250227131613.52683-3-urezki@gmail.com Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-03-04Flush console log from kernel_power_off()Paul E. McKenney
Kernels built with CONFIG_PREEMPT_RT=y can lose significant console output at shutdown time, which hides shutdown-time RCU issues from rcutorture. Therefore, make pr_flush() public and invoke it after the last print in kernel_power_off(). [ paulmck: Apply John Ogness feedback. ] [ paulmck: Apply Sebastian Andrzej Siewior feedback. ] [ paulmck: Apply kernel test robot feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Link: https://lore.kernel.org/r/5f743488-dc2a-4f19-bdda-cf50b9314832@paulmck-laptop Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-03-04rcu-tasks: Move RCU Tasks self-tests to core_initcall()Paul E. McKenney
The timer and hrtimer softirq processing has moved to dedicated threads for kernels built with CONFIG_IRQ_FORCED_THREADING=y. This results in timers not expiring until later in early boot, which in turn causes the RCU Tasks self-tests to hang in kernels built with CONFIG_PROVE_RCU=y, which further causes the entire kernel to hang. One fix would be to make timers work during this time, but there are no known users of RCU Tasks grace periods during that time, so no justification for the added complexity. Not yet, anyway. This commit therefore moves the call to rcu_init_tasks_generic() from kernel_init_freeable() to a core_initcall(). This works because the timer and hrtimer kthreads are created at early_initcall() time. Fixes: 49a17639508c3 ("softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.") Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: <linux-trace-kernel@vger.kernel.org> Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-03-04ppp: use IFF_NO_QUEUE in virtual interfacesQingfang Deng
For PPPoE, PPTP, and PPPoL2TP, the start_xmit() function directly forwards packets to the underlying network stack and never returns anything other than 1. So these interfaces do not require a qdisc, and the IFF_NO_QUEUE flag should be set. Introduces a direct_xmit flag in struct ppp_channel to indicate when IFF_NO_QUEUE should be applied. The flag is set in ppp_connect_channel() for relevant protocols. While at it, remove the unused latency member from struct ppp_channel. Signed-off-by: Qingfang Deng <dqfext@gmail.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250301135517.695809-1-dqfext@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04PCI: hotplug: Inline pci_hp_{create,remove}_module_link()Lukas Wunner
For no apparent reason, the pci_hp_{create,remove}_module_link() helpers live in slot.c, even though they're only called from two functions in pci_hotplug_core.c. Inline the helpers to reduce code size and number of exported symbols. Link: https://lore.kernel.org/r/c207f03cfe32ae9002d9b453001a1dd63d9ab3fb.1740501868.git.lukas@wunner.de Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-04PCI: hotplug: Drop superfluous pci_hotplug_slot_listLukas Wunner
The PCI hotplug core keeps a list of all registered slots. Its sole purpose is to WARN() on slot removal if another slot is using the same name. But this can never happen because already on slot creation, an error is returned and multiple messages are emitted if a slot's name is duplicated: pci_hp_register() __pci_hp_register() __pci_hp_initialize() pci_create_slot() kobject_init_and_add() kobject_add_varg() kobject_add_internal() create_dir() sysfs_create_dir_ns() kernfs_create_dir_ns() sysfs_warn_dup() pr_warn("cannot create duplicate filename ...") pr_err("%s failed for %s with -EEXIST, ..."); Drop the superfluous list. Link: https://lore.kernel.org/r/603735bc50eb370bc7f1c358441ac671360bab25.1740501868.git.lukas@wunner.de Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-04ACPI: platform_profile: Add support for hidden choicesMario Limonciello
When two drivers don't support all the same profiles the legacy interface only exports the common profiles. This causes problems for cases where one driver uses low-power but another uses quiet because the result is that neither is exported to sysfs. To allow two drivers to disagree, add support for "hidden choices". Hidden choices are platform profiles that a driver supports to be compatible with the platform profile of another driver. Fixes: 688834743d67 ("ACPI: platform_profile: Allow multiple handlers") Reported-by: Antheas Kapenekakis <lkml@antheas.dev> Closes: https://lore.kernel.org/platform-driver-x86/e64b771e-3255-42ad-9257-5b8fc6c24ac9@gmx.de/T/#mc068042dd29df36c16c8af92664860fc4763974b Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Antheas Kapenekakis <lkml@antheas.dev> Tested-by: Derek J. Clark <derekjohn.clark@gmail.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://patch.msgid.link/20250228170155.2623386-2-superm1@kernel.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-03-04x86/preempt: Move preempt count to percpu hot sectionBrian Gerst
No functional change. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Uros Bizjak <ubizjak@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250303165246.2175811-4-brgerst@gmail.com
2025-03-04percpu: Introduce percpu hot sectionBrian Gerst
Add a subsection to the percpu data for frequently accessed variables that should remain cached on each processor. These variables should not be accessed from other processors to avoid cacheline bouncing. This will replace the pcpu_hot struct on x86, and open up similar functionality to other architectures and the kernel core. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Uros Bizjak <ubizjak@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250303165246.2175811-2-brgerst@gmail.com
2025-03-04Merge branch 'x86/asm' into x86/core, to pick up dependent commitsIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-04fs/pipe: Read pipe->{head,tail} atomically outside pipe->mutexLinus Torvalds
pipe_readable(), pipe_writable(), and pipe_poll() can read "pipe->head" and "pipe->tail" outside of "pipe->mutex" critical section. When the head and the tail are read individually in that order, there is a window for interruption between the two reads in which both the head and the tail can be updated by concurrent readers and writers. One of the problematic scenarios observed with hackbench running multiple groups on a large server on a particular pipe inode is as follows: pipe->head = 36 pipe->tail = 36 hackbench-118762 [057] ..... 1029.550548: pipe_write: *wakes up: pipe not full* hackbench-118762 [057] ..... 1029.550548: pipe_write: head: 36 -> 37 [tail: 36] hackbench-118762 [057] ..... 1029.550548: pipe_write: *wake up next reader 118740* hackbench-118762 [057] ..... 1029.550548: pipe_write: *wake up next writer 118768* hackbench-118768 [206] ..... 1029.55055X: pipe_write: *writer wakes up* hackbench-118768 [206] ..... 1029.55055X: pipe_write: head = READ_ONCE(pipe->head) [37] ... CPU 206 interrupted (exact wakeup was not traced but 118768 did read head at 37 in traces) hackbench-118740 [057] ..... 1029.550558: pipe_read: *reader wakes up: pipe is not empty* hackbench-118740 [057] ..... 1029.550558: pipe_read: tail: 36 -> 37 [head = 37] hackbench-118740 [057] ..... 1029.550559: pipe_read: *pipe is empty; wakeup writer 118768* hackbench-118740 [057] ..... 1029.550559: pipe_read: *sleeps* hackbench-118766 [185] ..... 1029.550592: pipe_write: *New writer comes in* hackbench-118766 [185] ..... 1029.550592: pipe_write: head: 37 -> 38 [tail: 37] hackbench-118766 [185] ..... 1029.550592: pipe_write: *wakes up reader 118766* hackbench-118740 [185] ..... 1029.550598: pipe_read: *reader wakes up; pipe not empty* hackbench-118740 [185] ..... 1029.550599: pipe_read: tail: 37 -> 38 [head: 38] hackbench-118740 [185] ..... 1029.550599: pipe_read: *pipe is empty* hackbench-118740 [185] ..... 1029.550599: pipe_read: *reader sleeps; wakeup writer 118768* ... 
CPU 206 switches back to writer hackbench-118768 [206] ..... 1029.550601: pipe_write: tail = READ_ONCE(pipe->tail) [38] hackbench-118768 [206] ..... 1029.550601: pipe_write: pipe_full()? (u32)(37 - 38) >= 16? Yes hackbench-118768 [206] ..... 1029.550601: pipe_write: *writer goes back to sleep* [ Tasks 118740 and 118768 can then indefinitely wait on each other. ] The unsigned arithmetic in pipe_occupancy() wraps around when "pipe->tail > pipe->head" leading to pipe_full() returning true despite the pipe being empty. The case of genuine wraparound of "pipe->head" is handled since pipe buffer has data allowing readers to make progress until the pipe->tail wraps too after which the reader will wake up a sleeping writer. However, mistaking the pipe to be full when it is in fact empty can lead to readers and writers waiting on each other indefinitely. This issue became more problematic and surfaced as a hang in hackbench after the optimization in commit aaec5a95d596 ("pipe_read: don't wake up the writer if the pipe is still full") significantly reduced the number of spurious wakeups of writers that had previously helped mask the issue. To avoid missing any updates between the reads of "pipe->head" and "pipe->tail", unionize the two with a single unsigned long "pipe->head_tail" member that can be loaded atomically. Using "pipe->head_tail" to read the head and the tail ensures the lockless checks do not miss any updates to the head or the tail and since those two are only updated under "pipe->mutex", it ensures that the head is always ahead of, or equal to the tail resulting in correct calculations. [ prateek: commit log, testing on x86 platforms. ]
Reported-and-debugged-by: Swapnil Sapkal <swapnil.sapkal@amd.com> Closes: https://lore.kernel.org/lkml/e813814e-7094-4673-bc69-731af065a0eb@amd.com/ Reported-by: Alexey Gladkov <legion@kernel.org> Closes: https://lore.kernel.org/all/Z8Wn0nTvevLRG_4m@example.org/ Fixes: 8cefc107ca54 ("pipe: Use head and tail pointers for the ring, not cursor and length") Tested-by: Swapnil Sapkal <swapnil.sapkal@amd.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Tested-by: Alexey Gladkov <legion@kernel.org> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
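The failure mode and the fix in this commit can be sketched in user space. The sketch assumes 32-bit head and tail values packed into one 64-bit word; `union pipe_index`, `pipe_full()` and `pipe_full_snapshot()` are hypothetical stand-ins for the kernel's definitions, and `__atomic_load_n()` is a GCC/Clang builtin used here in place of the kernel's own accessors.

```c
#include <stdint.h>

/* Assumption: head and tail share one 64-bit word so that both can be
 * loaded in a single access (anonymous struct members need C11). */
union pipe_index {
    uint64_t head_tail;
    struct {
        uint32_t head;
        uint32_t tail;
    };
};

/* Occupancy check using unsigned modulo arithmetic, as in the trace. */
static int pipe_full(uint32_t head, uint32_t tail, uint32_t limit)
{
    return (uint32_t)(head - tail) >= limit;
}

/* Lockless check from one snapshot: head and tail come from the same
 * load, so the pair is always one that actually existed together. */
static int pipe_full_snapshot(const union pipe_index *p, uint32_t limit)
{
    union pipe_index snap;

    snap.head_tail = __atomic_load_n(&p->head_tail, __ATOMIC_RELAXED);
    return pipe_full(snap.head, snap.tail, limit);
}
```

Feeding `pipe_full()` the mismatched pair from the trace (stale head 37, fresh tail 38, limit 16) reports a full pipe, whereas a snapshot of the real state (head 38, tail 38) reports that it is not full.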
2025-03-04Coresight: Add Coresight TMC Control Unit driverJie Gan
The Coresight TMC Control Unit hosts miscellaneous configuration registers which control various features related to TMC ETR sink. Based on the trace ID, which is programmed in the related CTCU ATID register of a specific ETR, trace data with that trace ID gets into the ETR buffer, while other trace data gets dropped. Enabling source device sets one bit of the ATID register based on source device's trace ID. Disabling source device resets the bit according to the source device's trace ID. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Jie Gan <quic_jiegan@quicinc.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250303032931.2500935-10-quic_jiegan@quicinc.com
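The enable/disable behaviour described here amounts to setting and clearing one bit per trace ID. The following is a hedged sketch only: the register layout (four 32-bit words covering 7-bit trace IDs) and the names `atid_regs`, `atid_set()` and `atid_accepts()` are assumptions made for illustration and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define ATID_REG_BITS 32u
#define ATID_MAX_ID   128u   /* assumption: 7-bit trace IDs */

/* Mock register file standing in for one ETR's ATID registers. */
static uint32_t atid_regs[ATID_MAX_ID / ATID_REG_BITS];

/* Set the trace ID's bit when a source is enabled, clear it when the
 * source is disabled. */
static void atid_set(unsigned int trace_id, int enable)
{
    uint32_t mask;

    assert(trace_id < ATID_MAX_ID);
    mask = UINT32_C(1) << (trace_id % ATID_REG_BITS);
    if (enable)
        atid_regs[trace_id / ATID_REG_BITS] |= mask;
    else
        atid_regs[trace_id / ATID_REG_BITS] &= ~mask;
}

/* Trace data reaches the ETR buffer only if its ID's bit is set. */
static int atid_accepts(unsigned int trace_id)
{
    assert(trace_id < ATID_MAX_ID);
    return (atid_regs[trace_id / ATID_REG_BITS] >> (trace_id % ATID_REG_BITS)) & 1u;
}
```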
2025-03-04Coresight: Change to read the trace ID from coresight_pathJie Gan
The source device can directly read the trace ID from the coresight_path which results in etm_read_alloc_trace_id and etm4_read_alloc_trace_id being deleted. Co-developed-by: James Clark <james.clark@linaro.org> Signed-off-by: James Clark <james.clark@linaro.org> Signed-off-by: Jie Gan <quic_jiegan@quicinc.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250303032931.2500935-7-quic_jiegan@quicinc.com
2025-03-04Coresight: Introduce a new struct coresight_pathJie Gan
Introduce a new structure, 'struct coresight_path', to store the data that is utilized by the devices in the path. The coresight_path will be built/released by coresight_build_path/coresight_release_path functions. Signed-off-by: Jie Gan <quic_jiegan@quicinc.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250303032931.2500935-5-quic_jiegan@quicinc.com
2025-03-04mm: Remove wait_on_page_locked()Matthew Wilcox (Oracle)
This compatibility wrapper has no callers left, so remove it. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-04mm: Remove grab_cache_page_write_begin()Matthew Wilcox (Oracle)
All callers have now been converted to use folios, so remove this compatibility wrapper. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-04mm: Remove wait_for_stable_page()Matthew Wilcox (Oracle)
The last caller has been converted to call folio_wait_stable(), so we can remove this wrapper. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-04Merge tag 'wireless-next-2025-03-04-v2' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-nextJakub Kicinski
Johannes Berg says: ==================== First 6.15 material: * cfg80211/mac80211 - remove cooked monitor support - strict mode for better AP testing - basic EPCS support - OMI RX bandwidth reduction support * rtw88 - preparation for RTL8814AU support * rtw89 - use wiphy_lock/wiphy_work - preparations for MLO - BT-Coex improvements - regulatory support in firmware files * iwlwifi - preparations for the new iwlmld sub-driver * tag 'wireless-next-2025-03-04-v2' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (128 commits) wifi: iwlwifi: remove mld/roc.c wifi: mac80211: refactor populating mesh related fields in sinfo wifi: cfg80211: reorg sinfo structure elements for mesh wifi: iwlwifi: Fix spelling mistake "Increate" -> "Increase" wifi: iwlwifi: add Debug Host Command APIs wifi: iwlwifi: add IWL_MAX_NUM_IGTKS macro wifi: iwlwifi: add OMI bandwidth reduction APIs wifi: iwlwifi: remove mvm prefix from iwl_mvm_d3_end_notif wifi: iwlwifi: remember if the UATS table was read successfully wifi: iwlwifi: export iwl_get_lari_config_bitmap wifi: iwlwifi: add support for external 32 KHz clock wifi: iwlwifi: mld: add a debug level for EHT prints wifi: iwlwifi: mld: add a debug level for PTP prints wifi: iwlwifi: remove mvm prefix from iwl_mvm_esr_mode_notif wifi: iwlwifi: use 0xff instead of 0xffffffff for invalid wifi: iwlwifi: location api cleanup wifi: cfg80211: expose update timestamp to drivers wifi: mac80211: add ieee80211_iter_chan_contexts_mtx wifi: mac80211: fix integer overflow in hwmp_route_info_get() wifi: mac80211: Fix possible integer promotion issue ... ==================== Link: https://patch.msgid.link/20250304125605.127914-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04ftrace: Add print_function_args()Sven Schnelle
Add a function to decode argument types with the help of BTF. Will be used to display arguments in the function and function graph tracer. It can only handle simple arguments and up to FTRACE_REGS_MAX_ARGS number of arguments. When it hits a max, it will print ", ...": page_to_skb(vi=0xffff8d53842dc980, rq=0xffff8d53843a0800, page=0xfffffc2e04337c00, offset=6160, len=64, truesize=1536, ...) And if it hits an argument that is not recognized, it will print the raw value and the type of argument it is: make_vfsuid(idmap=0xffffffff87f99db8, fs_userns=0xffffffff87e543c0, kuid=0x0 (STRUCT)) __pti_set_user_pgtbl(pgdp=0xffff8d5384ab47f8, pgd=0x110e74067 (STRUCT)) Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Guo Ren <guoren@kernel.org> Cc: Donglin Peng <dolinux.peng@gmail.com> Cc: Zheng Yejian <zhengyejian@huaweicloud.com> Link: https://lore.kernel.org/20250227185822.639418500@goodmis.org Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Co-developed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-03-04Coresight: Add trace_id function to retrieving the trace IDJie Gan
Add 'trace_id' function pointer in coresight_ops. It's responsible for retrieving the device's trace ID. Co-developed-by: James Clark <james.clark@linaro.org> Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Jie Gan <quic_jiegan@quicinc.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250303032931.2500935-3-quic_jiegan@quicinc.com
2025-03-04Coresight: Add support for new APB clock nameJie Gan
Add support for new APB clock-name. If the function fails to obtain the clock with the name "apb_pclk", it will attempt to acquire the clock with the name "apb". Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Jie Gan <quic_jiegan@quicinc.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20250303032931.2500935-2-quic_jiegan@quicinc.com
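The fallback order described in this commit can be sketched with a mock lookup. Everything except the two clock names is hypothetical: `clk_lookup_fn`, `get_apb_clock()` and the three mock lookups exist only for this illustration and do not correspond to the kernel clock API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup callback: returns the clock's name as a handle,
 * or NULL if no clock with that name exists. */
typedef const char *(*clk_lookup_fn)(const char *name);

/* Try the original name first, then fall back to the new one. */
static const char *get_apb_clock(clk_lookup_fn lookup)
{
    const char *clk = lookup("apb_pclk");

    if (!clk)
        clk = lookup("apb");
    return clk;
}

/* Mock platforms for the sketch: one exposes only "apb_pclk", one only
 * "apb", and one exposes neither. */
static const char *lookup_legacy(const char *name)
{
    return strcmp(name, "apb_pclk") == 0 ? name : NULL;
}

static const char *lookup_new(const char *name)
{
    return strcmp(name, "apb") == 0 ? name : NULL;
}

static const char *lookup_none(const char *name)
{
    (void)name;
    return NULL;
}
```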
2025-03-04irqchip/davinci-cp-intc: Remove public headerBartosz Golaszewski
There are no more users of irq-davinci-cp-intc.h (da830.c doesn't use any of its symbols). Remove the header and make the driver stop using the config structure. [ tglx: Mop up coding style ] Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250304131815.86549-1-brgl@bgdev.pl
2025-03-04Add STM32MP25 SPI NOR supportMark Brown
Merge series from patrice.chotard@foss.st.com: This series adds SPI NOR support for STM32MP25 SoCs from STMicroelectronics. On the STM32MP25 SoC family, an Octo Memory Manager block manages the muxing, the memory area split, the chip select override and the time constraint between its 2 Octo SPI children. Due to these dependencies, this series adds support for:
- Octo Memory Manager driver (not applied for SPI).
- Octo SPI driver.
- yaml schema for Octo Memory Manager and Octo SPI drivers.
The device tree files add the Octo Memory Manager and its 2 associated Octo SPI children in stm32mp251.dtsi and add SPI NOR support in the stm32mp257f-ev1 board.
2025-03-04net: plumb extack in __dev_change_net_namespace()Nicolas Dichtel
It could be hard to understand why the netlink command fails. For example, if dev->netns_immutable is set, the error is "Invalid argument". Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04net: rename netns_local to netns_immutableNicolas Dichtel
The name 'netns_local' is confusing. A following commit will export it via netlink, so let's use a more explicit name. Reported-by: Eric Dumazet <edumazet@google.com> Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04mtd: spinand: Add read retry supportCheng Ming Lin
When the host ECC fails to correct a data error on the NAND device, the host can set up a special read mode for data recovery that applies to the next read. Several retry levels can be attempted until the lost data is recovered or definitively assumed lost. Signed-off-by: Cheng Ming Lin <chengminglin@mxic.com.tw> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
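A minimal userspace sketch of such a retry loop follows. Everything in it is an assumption made for illustration: `struct sketch_nand` and the three functions are hypothetical, level 0 stands for the normal read mode, and the "chip" simply succeeds at one preselected level rather than talking to hardware.

```c
#include <assert.h>
#include <errno.h>

struct sketch_nand {
	int max_retry_level;  /* highest retry level the chip offers */
	int good_level;       /* level at which the read succeeds; -1: never */
	int current_level;    /* retry level currently programmed */
};

/* Program the retry level that the next read will use. */
static void sketch_set_retry_level(struct sketch_nand *nand, int level)
{
	nand->current_level = level;
}

/* Simulated read: 0 on success, -EBADMSG for an uncorrectable ECC error. */
static int sketch_read_page(const struct sketch_nand *nand)
{
	return nand->current_level == nand->good_level ? 0 : -EBADMSG;
}

/* Try each level in turn until the read stops failing with -EBADMSG. */
static int sketch_read_with_retry(struct sketch_nand *nand, int *level_used)
{
	int ret = -EBADMSG;
	int level;

	for (level = 0; level <= nand->max_retry_level; level++) {
		sketch_set_retry_level(nand, level);
		ret = sketch_read_page(nand);
		if (ret != -EBADMSG) {
			*level_used = level;
			break;
		}
	}
	/* Always return to the normal read mode for later reads. */
	sketch_set_retry_level(nand, 0);
	return ret;
}
```

If every level fails, the caller still sees -EBADMSG, which corresponds to the data being assumed lost.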
2025-03-04gpiolib: of: Handle threecell GPIO chipsLinus Walleij
When describing GPIO controllers in the device tree, the ambition of device tree to describe the hardware may require a three-cell scheme:

  gpios = <&gpio instance offset flags>;

This implements support for this scheme in the gpiolib OF core.

Drivers that want to handle multiple gpiochip instances from one OF node need to implement a callback similar to this to determine if a certain gpio chip is a pointer to the right instance (pseudo-code):

  struct my_gpio {
      struct gpio_chip gcs[MAX_CHIPS];
  };

  static bool my_of_node_instance_match(struct gpio_chip *gc,
                                        unsigned int instance)
  {
      struct my_gpio *mg = gpiochip_get_data(gc);

      if (instance >= MAX_CHIPS)
          return false;
      return (gc == &mg->gcs[instance]);
  }

  probe() {
      struct my_gpio *mg;
      struct gpio_chip *gc;
      int i, ret;

      for (i = 0; i < MAX_CHIPS; i++) {
          gc = &mg->gcs[i];
          /* This tells gpiolib we have several instances per node */
          gc->of_gpio_n_cells = 3;
          gc->of_node_instance_match = my_of_node_instance_match;
          gc->base = -1;
          ...
          ret = devm_gpiochip_add_data(dev, gc, mg);
          if (ret)
              return ret;
      }
  }

Rename the "simple" of_xlate function to "twocell" which is closer to what it actually does.

In the device tree bindings, the provider node needs to specify #gpio-cells = <3>; where the first cell is the instance number:

  gpios = <&gpio instance offset flags>;

Conversely ranges need to have four cells:

  gpio-ranges = <&pinctrl instance gpio_offset pin_offset count>;

Reviewed-by: Alex Elder <elder@riscstar.com> Tested-by: Yixun Lan <dlan@gentoo.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20250225-gpio-ranges-fourcell-v3-2-860382ba4713@linaro.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2025-03-04Merge branch 'x86/cpu' into x86/asm, to pick up dependent commitsIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-04<linux/sizes.h>: Cover all possible x86 CPU cache sizesAhmed S. Darwish
Add size macros for 24/192/384 Kilobytes and 3/6/12/18/24 Megabytes. With that, the x86 subsystem can avoid locally defining its own macros for CPU cache sizes. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20250304085152.51092-31-darwi@linutronix.de
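For reference, the byte counts behind those sizes can be worked out as below. The `SKETCH_` names are hypothetical and chosen so they cannot be mistaken for the actual definitions in <linux/sizes.h>; the helper function is likewise only an illustration of how a caller might use such constants.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for this sketch: binary kilobytes and megabytes. */
#define SKETCH_KIB(n)	((n) * 1024UL)
#define SKETCH_MIB(n)	((n) * 1024UL * 1024UL)

#define SKETCH_SZ_24K	SKETCH_KIB(24)	/* 0x00006000 */
#define SKETCH_SZ_192K	SKETCH_KIB(192)	/* 0x00030000 */
#define SKETCH_SZ_384K	SKETCH_KIB(384)	/* 0x00060000 */
#define SKETCH_SZ_3M	SKETCH_MIB(3)	/* 0x00300000 */
#define SKETCH_SZ_6M	SKETCH_MIB(6)	/* 0x00600000 */
#define SKETCH_SZ_12M	SKETCH_MIB(12)	/* 0x00c00000 */
#define SKETCH_SZ_18M	SKETCH_MIB(18)	/* 0x01200000 */
#define SKETCH_SZ_24M	SKETCH_MIB(24)	/* 0x01800000 */

/* Return 1 if bytes matches one of the sizes listed in the commit message. */
static int sketch_is_listed_cache_size(unsigned long bytes)
{
	static const unsigned long sizes[] = {
		SKETCH_SZ_24K, SKETCH_SZ_192K, SKETCH_SZ_384K, SKETCH_SZ_3M,
		SKETCH_SZ_6M, SKETCH_SZ_12M, SKETCH_SZ_18M, SKETCH_SZ_24M,
	};
	size_t i;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		if (sizes[i] == bytes)
			return 1;
	}
	return 0;
}
```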
2025-03-04Merge branch 'x86/urgent' into x86/cpu, to pick up dependent commitsIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-04wait: avoid spurious calls to prepare_to_wait_event() in ___wait_event()Mateusz Guzik
In the vast majority of cases the condition determining whether the thread can proceed is true after the first wake up. However, even in that case the thread ends up calling into prepare_to_wait_event() again, suffering a spurious irq + lock trip. Then it calls into finish_wait() to unlink itself. Note that in case of a pending signal the work done by prepare_to_wait_event() gets ignored even without the change. Pre-check the condition after waking up instead.

Stats gathered during a kernel build:

  bpftrace -e 'kprobe:prepare_to_wait_event,kprobe:finish_wait \
      { @[probe] = count(); }'

  @[kprobe:finish_wait]: 392483
  @[kprobe:prepare_to_wait_event]: 778690

That is, calls to prepare_to_wait_event() are almost double the calls to finish_wait(). This evens out with the patch.

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250303230409.452687-4-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
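The effect on the call counts can be reproduced with a small userspace model of the loop. This is only a simulation under stated assumptions: the `sketch_` names are hypothetical, sleeping is replaced by a counter increment, and signal handling is omitted entirely.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_waiter {
	unsigned int wakeups_seen;    /* how many times the thread was woken */
	unsigned int wakeups_needed;  /* condition turns true after this many */
	unsigned int prepare_calls;   /* stand-in for prepare_to_wait_event() */
	unsigned int finish_calls;    /* stand-in for finish_wait() */
};

static bool sketch_condition(const struct sketch_waiter *w)
{
	return w->wakeups_seen >= w->wakeups_needed;
}

/* Loop shape before the change: the condition is only tested after prepare. */
static void sketch_wait_old(struct sketch_waiter *w)
{
	for (;;) {
		w->prepare_calls++;
		if (sketch_condition(w))
			break;
		w->wakeups_seen++;	/* models sleeping until woken */
	}
	w->finish_calls++;
}

/* Loop shape after the change: re-check the condition right after waking. */
static void sketch_wait_new(struct sketch_waiter *w)
{
	for (;;) {
		w->prepare_calls++;
		if (sketch_condition(w))
			break;
		w->wakeups_seen++;	/* models sleeping until woken */
		if (sketch_condition(w))
			break;		/* skips the extra prepare call */
	}
	w->finish_calls++;
}
```

When the condition becomes true after a single wakeup, the old loop makes two prepare calls per finish call while the new loop makes one, which matches the roughly 2:1 ratio in the bpftrace numbers.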
2025-03-04pipe: cache 2 pages instead of 1Mateusz Guzik
User data is kept in a circular buffer backed by pages allocated as needed. Only having space for one spare is still prone to having to resort to allocation / freeing. In my testing this decreases page allocs by 60% during a kernel build. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250303230409.452687-3-mjguzik@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
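The idea of keeping two spare pages can be sketched in userspace like this. The structure and function names are hypothetical, `malloc()` stands in for the page allocator, and the 4096-byte page size is an assumption; the actual pipe code is not shown in this log.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_SPARE_SLOTS 2	/* the commit raises this from 1 to 2 */
#define SKETCH_PAGE_SIZE   4096	/* assumed page size for the sketch */

struct sketch_pipe {
	void *spare[SKETCH_SPARE_SLOTS];  /* cached pages, NULL when empty */
	unsigned long allocs;             /* how often we had to allocate */
};

/* Take a cached page if one is available, otherwise allocate a new one. */
static void *sketch_get_page(struct sketch_pipe *p)
{
	int i;

	for (i = 0; i < SKETCH_SPARE_SLOTS; i++) {
		if (p->spare[i]) {
			void *page = p->spare[i];

			p->spare[i] = NULL;
			return page;
		}
	}
	p->allocs++;
	return malloc(SKETCH_PAGE_SIZE);
}

/* Stash the page in a free slot; free it only when the cache is full. */
static void sketch_put_page(struct sketch_pipe *p, void *page)
{
	int i;

	for (i = 0; i < SKETCH_SPARE_SLOTS; i++) {
		if (!p->spare[i]) {
			p->spare[i] = page;
			return;
		}
	}
	free(page);
}
```

With two slots, a pattern that releases two pages and then needs two again is served entirely from the cache; with a single slot the second request would go back to the allocator.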
2025-03-04perf/core: Detach 'struct perf_cpu_pmu_context' and 'struct pmu' lifetimesPeter Zijlstra
In preparation for being able to unregister a PMU with existing events, it becomes important to detach struct perf_cpu_pmu_context lifetimes from that of struct pmu. Notably struct perf_cpu_pmu_context embeds a struct perf_event_pmu_context that can stay referenced until the last event goes. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20241104135518.760214287@infradead.org
2025-03-04perf/core: Merge struct pmu::pmu_disable_count into struct perf_cpu_pmu_context::pmu_disable_countPeter Zijlstra
Because it makes no sense to have two per-cpu allocations per pmu. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20241104135518.518730578@infradead.org