summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-09-04printk: nbcon: Introduce printer kthreadsThomas Gleixner
Provide the main implementation for running a printer kthread per nbcon console that is takeover/handover aware. This includes: - new mandatory write_thread() callback - kthread creation - kthread main printing loop - kthread wakeup mechanism - kthread shutdown kthread creation is a bit tricky because consoles may register before kthreads can be created. In such cases, registration will succeed, even though no kthread exists. Once kthreads can be created, an early_initcall will set @printk_kthreads_ready. If there are no registered boot consoles, the early_initcall creates the kthreads for all registered nbcon consoles. If kthread creation fails, the related console is unregistered. If there are registered boot consoles when @printk_kthreads_ready is set, no kthreads are created until the final boot console unregisters. Once kthread creation finally occurs, @printk_kthreads_running is set so that the system knows kthreads are available for all registered nbcon consoles. If @printk_kthreads_running is already set when the console is registering, the kthread is created during registration. If kthread creation fails, the registration will fail. Until @printk_kthreads_running is set, console printing occurs directly via the console_lock. kthread shutdown on system shutdown/reboot is necessary to ensure the printer kthreads finish their printing so that the system can cleanly transition back to direct printing via the console_lock in order to reliably push out the final shutdown/reboot messages. @printk_kthreads_running is cleared before shutting down the individual kthreads. The kthread uses a new mandatory write_thread() callback that is called with both device_lock() and the console context acquired. The console ownership handling is necessary for synchronization against write_atomic() which is synchronized only via the console context ownership. The device_lock() serializes acquiring the console context with NBCON_PRIO_NORMAL. It is needed in case the device_lock() does not disable preemption. It prevents the following race: CPU0 CPU1 [ task A ] nbcon_context_try_acquire() # success with NORMAL prio # .unsafe == false; // safe for takeover [ schedule: task A -> B ] WARN_ON() nbcon_atomic_flush_pending() nbcon_context_try_acquire() # success with EMERGENCY prio # flushing nbcon_context_release() # HERE: con->nbcon_state is free # to take by anyone !!! nbcon_context_try_acquire() # success with NORMAL prio [ task B ] [ schedule: task B -> A ] nbcon_enter_unsafe() nbcon_context_can_proceed() BUG: nbcon_context_can_proceed() returns "true" because the console is owned by a context on CPU0 with NBCON_PRIO_NORMAL. But it should return "false". The console is owned by a context from task B and we do the check in a context from task A. Note that with these changes, the printer kthreads do not yet take over full responsibility for nbcon printing during normal operation. These changes only focus on the lifecycle of the kthreads. Co-developed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240904120536.115780-7-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-09-04printk: nbcon: Init @nbcon_seq to highest possibleJohn Ogness
When initializing an nbcon console, have nbcon_alloc() set @nbcon_seq to the highest possible sequence number. For all practical purposes, this will guarantee that the console will have nothing to print until later when @nbcon_seq is set to the proper initial printing value. This will be particularly important once kthread printing is introduced because nbcon_alloc() can create/start the kthread before the desired initial sequence number is known. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240904120536.115780-6-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-09-04printk: nbcon: Add context to usable() and emit()John Ogness
The nbcon consoles will have two callbacks to be used for different contexts. In order to determine if an nbcon console is usable, console_is_usable() must know if it is a context that will need to use the optional write_atomic() callback. Also, nbcon_emit_next_record() must know which callback it needs to call. Add an extra parameter @use_atomic to console_is_usable() and nbcon_emit_next_record() to specify this. Since so far only the write_atomic() callback exists, @use_atomic is set to true for all call sites. For legacy consoles, @use_atomic is not used. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240904120536.115780-5-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-09-04printk: Flush console on unregister_console()John Ogness
Ensure consoles have flushed pending records before unregistering. The console should print up to at least its related "console disabled" record. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240904120536.115780-4-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-09-04printk: Fail pr_flush() if before SYSTEM_SCHEDULINGJohn Ogness
A follow-up change adds pr_flush() to console unregistration. However, with boot consoles unregistration can happen very early if there are also regular consoles registering as well. In this case the pr_flush() is not important because all consoles are flushed when checking the initial console sequence number. Allow pr_flush() to fail if @system_state has not yet reached SYSTEM_SCHEDULING. This avoids might_sleep() and msleep() explosions that would otherwise occur: [ 0.436739][ T0] printk: legacy console [ttyS0] enabled [ 0.439820][ T0] printk: legacy bootconsole [earlyser0] disabled [ 0.446822][ T0] BUG: scheduling while atomic: swapper/0/0/0x00000002 [ 0.450491][ T0] 1 lock held by swapper/0/0: [ 0.457897][ T0] #0: ffffffff82ae5f88 (console_mutex){+.+.}-{4:4}, at: console_list_lock+0x20/0x70 [ 0.463141][ T0] Modules linked in: [ 0.465307][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.10.0-rc1+ #372 [ 0.469394][ T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 [ 0.474402][ T0] Call Trace: [ 0.476246][ T0] <TASK> [ 0.481473][ T0] dump_stack_lvl+0x93/0xb0 [ 0.483949][ T0] dump_stack+0x10/0x20 [ 0.486256][ T0] __schedule_bug+0x68/0x90 [ 0.488753][ T0] __schedule+0xb9b/0xd80 [ 0.491179][ T0] ? lock_release+0xb5/0x270 [ 0.493732][ T0] schedule+0x43/0x170 [ 0.495998][ T0] schedule_timeout+0xc5/0x1e0 [ 0.498634][ T0] ? __pfx_process_timeout+0x10/0x10 [ 0.501522][ T0] ? msleep+0x13/0x50 [ 0.503728][ T0] msleep+0x3c/0x50 [ 0.505847][ T0] __pr_flush.constprop.0.isra.0+0x56/0x500 [ 0.509050][ T0] ? _printk+0x58/0x80 [ 0.511332][ T0] ? lock_is_held_type+0x9c/0x110 [ 0.514106][ T0] unregister_console_locked+0xe1/0x450 [ 0.517144][ T0] register_console+0x509/0x620 [ 0.519827][ T0] ? __pfx_univ8250_console_init+0x10/0x10 [ 0.523042][ T0] univ8250_console_init+0x24/0x40 [ 0.525845][ T0] console_init+0x43/0x210 [ 0.528280][ T0] start_kernel+0x493/0x980 [ 0.530773][ T0] x86_64_start_reservations+0x18/0x30 [ 0.533755][ T0] x86_64_start_kernel+0xae/0xc0 [ 0.536473][ T0] common_startup_64+0x12c/0x138 [ 0.539210][ T0] </TASK> And then the kernel goes into an infinite loop complaining about: 1. releasing a pinned lock 2. unpinning an unpinned lock 3. bad: scheduling from the idle thread! 4. goto 1 Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240904120536.115780-3-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-09-04printk: nbcon: Add function for printers to reacquire ownershipJohn Ogness
Since ownership can be lost at any time due to handover or takeover, a printing context _must_ be prepared to back out immediately and carefully. However, there are scenarios where the printing context must reacquire ownership in order to finalize or revert hardware changes. One such example is when interrupts are disabled during printing. No other context will automagically re-enable the interrupts. For this case, the disabling context _must_ reacquire nbcon ownership so that it can re-enable the interrupts. Provide nbcon_reacquire_nobuf() for exactly this purpose. It allows a printing context to reacquire ownership using the same priority as its previous ownership. Note that after a successful reacquire the printing context will have no output buffer because that has been lost. This function cannot be used to resume printing. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240904120536.115780-2-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-09-04printk: nbcon: Use raw_cpu_ptr() instead of open codingJohn Ogness
There is no need to open code a non-migration-checking this_cpu_ptr(). That is exactly what raw_cpu_ptr() is. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/87plpum4jw.fsf@jogness.linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-09-04printk: Use the BITS_PER_LONG macroJinjie Ruan
sizeof(unsigned long) * 8 is the number of bits in an unsigned long variable, replace it with BITS_PER_LONG macro to make it simpler. Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Reviewed-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240903035358.308482-1-ruanjinjie@huawei.com Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21lockdep: Mark emergency sections in lockdep splatsJohn Ogness
Mark emergency sections wherever multiple lines of lock debugging output are generated. In an emergency section, every printk() call will attempt to directly flush to the consoles using the EMERGENCY priority. Note that debug_show_all_locks() and lockdep_print_held_locks() rely on their callers to enter the emergency section. This is because these functions can also be called in non-emergency situations (such as sysrq). Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-36-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21rcu: Mark emergency sections in rcu stallsJohn Ogness
Mark emergency sections wherever multiple lines of rcu stall information are generated. In an emergency section, every printk() call will attempt to directly flush to the consoles using the EMERGENCY priority. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20240820063001.36405-35-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21panic: Mark emergency section in oopsJohn Ogness
Mark an emergency section beginning with oops_enter() until the end of oops_exit(). In this section, every printk() call will attempt to directly flush to the consoles using the EMERGENCY priority. The very end of oops_exit() performs a kmsg_dump(). This is not included in the emergency section because it is another flushing mechanism that should occur after the consoles have flushed the oops messages. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-34-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21panic: Mark emergency section in warnThomas Gleixner
Mark the full contents of __warn() as an emergency section. In this section, every printk() call will attempt to directly flush to the consoles using the EMERGENCY priority. Co-developed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-33-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Implement emergency sectionsThomas Gleixner
In emergency situations (something has gone wrong but the system continues to operate), usually important information (such as a backtrace) is generated via printk(). This information should be pushed out to the consoles ASAP. Add per-CPU emergency nesting tracking because an emergency can arise while in an emergency situation. Add functions to mark the beginning and end of emergency sections where the urgent messages are generated. Perform direct console flushing at the emergency priority if the current CPU is in an emergency state and it is safe to do so. Note that the emergency state is not system-wide. While one CPU is in an emergency state, another CPU may attempt to print console messages at normal priority. Also note that printk() already attempts to flush consoles in the caller context for normal priority. However, follow-up changes will introduce printing kthreads, in which case the normal priority printk() calls will offload to the kthreads. Co-developed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-32-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: Add helper for flush type logicJohn Ogness
There are many call sites where console flushing occur. Depending on the system state and types of consoles, the flush methods to use are different. A flush call site generally must consider: @have_boot_console @have_nbcon_console @have_legacy_console @legacy_allow_panic_sync is_printk_preferred() and take into account the current CPU state: NBCON_PRIO_NORMAL NBCON_PRIO_EMERGENCY NBCON_PRIO_PANIC in order to decide if it should: flush nbcon directly via atomic_write() callback flush legacy directly via console_unlock flush legacy via offload to irq_work All of these call sites use their own logic to make this decision, which is complicated and error prone. Especially later when two more flush methods will be introduced: flush nbcon via offload to kthread flush legacy via offload to kthread Introduce a new internal struct console_flush_type that specifies which console flushing methods should be used in the context of the caller. Introduce a helper function to fill out console_flush_type to be used for flushing call sites. Replace the logic of all flushing call sites to use the new helper. This change standardizes behavior, leading to both fixes and optimizations across various call sites. For instance, in console_cpu_notify(), the new logic ensures that nbcon consoles are flushed when they aren’t managed by the legacy loop. Similarly, in console_flush_on_panic(), the system no longer needs to flush nbcon consoles if none are present. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-31-john.ogness@linutronix.de [pmladek@suse.com: Updated the commit message.] Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: Coordinate direct printing in panicJohn Ogness
If legacy and nbcon consoles are registered and the nbcon consoles are allowed to flush (i.e. no boot consoles registered), the legacy consoles will no longer perform direct printing on the panic CPU until after the backtrace has been stored. This will give the safe nbcon consoles a chance to print the panic messages before allowing the unsafe legacy consoles to print. If no nbcon consoles are registered or they are not allowed to flush because boot consoles are registered, there is no change in behavior (i.e. legacy consoles will always attempt to print from the printk() caller context). Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-30-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: Track nbcon consolesJohn Ogness
Add a global flag @have_nbcon_console to identify if any nbcon consoles are registered. This will be used in follow-up commits to preserve legacy behavior when no nbcon consoles are registered. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-29-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: Avoid console_lock dance if no legacy or boot consolesJohn Ogness
Currently the console lock is used to attempt legacy-type printing even if there are no legacy or boot consoles registered. If no such consoles are registered, the console lock does not need to be taken. Add tracking of legacy console registration and use it with boot console tracking to avoid unnecessary code paths, i.e. do not use the console lock if there are no boot consoles and no legacy consoles. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-28-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Add unsafe flushing on panicJohn Ogness
Add nbcon_atomic_flush_unsafe() to flush all nbcon consoles using the write_atomic() callback and allowing unsafe hostile takeovers. Call this at the end of panic() as a final attempt to flush any pending messages. Note that legacy consoles use unsafe methods for flushing from the beginning of panic (see bust_spinlocks()). Therefore, systems using both legacy and nbcon consoles may still fail to see panic messages due to unsafe legacy console usage. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-27-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: Flush nbcon consoles first on panicJohn Ogness
In console_flush_on_panic(), flush the nbcon consoles before flushing legacy consoles. The legacy write() callbacks are not fully safe when oops_in_progress is set. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-26-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Flush new records on device_release()John Ogness
There may be new records that were added while a driver was holding the nbcon context for non-printing purposes. These new records must be flushed by the nbcon_device_release() context because no other context will do it. If boot consoles are registered, the legacy loop is used (either direct or per irq_work) to handle the flushing. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-25-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: Add is_printk_legacy_deferred()John Ogness
If printk has been explicitly deferred or is called from NMI context, legacy console printing must be deferred to an irq_work context. Introduce a helper function is_printk_legacy_deferred() for a CPU to query if it must defer legacy console printing. In follow-up commits this helper will be needed at other call sites as well. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-24-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Use nbcon consoles in console_flush_all()John Ogness
Allow nbcon consoles to print messages in the legacy printk() caller context (printing via unlock) by integrating them into console_flush_all(). The write_atomic() callback is used for printing. Provide nbcon_legacy_emit_next_record(), which acts as the nbcon variant of console_emit_next_record(). Call this variant within console_flush_all() for nbcon consoles. Since nbcon consoles use their own @nbcon_seq variable to track the next record to print, this also must be appropriately handled in console_flush_all(). Note that the legacy printing logic uses @handover to detect handovers for printing all consoles. For nbcon consoles, handovers/takeovers occur on a per-console basis and thus do not cause the console_flush_all() loop to abort. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-23-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: Track registered boot consolesJohn Ogness
Unfortunately it is not known if a boot console and a regular (legacy or nbcon) console use the same hardware. For this reason they must not be allowed to print simultaneously. For legacy consoles this is not an issue because they are already synchronized with the boot consoles using the console lock. However nbcon consoles can be triggered separately. Add a global flag @have_boot_console to identify if any boot consoles are registered. This will be used in follow-up commits to ensure that boot consoles and nbcon consoles cannot print simultaneously. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-22-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Provide function to flush using write_atomic()Thomas Gleixner
Provide nbcon_atomic_flush_pending() to perform flushing of all registered nbcon consoles using their write_atomic() callback. Unlike console_flush_all(), nbcon_atomic_flush_pending() will only flush up through the newest record at the time of the call. This prevents a CPU from printing unbounded when other CPUs are adding records. If new records are added while flushing, it is expected that the dedicated printer threads will print those records. If the printer thread is not available (which is always the case at this point in the rework), nbcon_atomic_flush_pending() _will_ flush all records in the ringbuffer. Unlike console_flush_all(), nbcon_atomic_flush_pending() will fully flush one console before flushing the next. This helps to guarantee that a block of pending records (such as a stack trace in an emergency situation) can be printed atomically at once before releasing console ownership. nbcon_atomic_flush_pending() is safe in any context because it uses write_atomic() and acquires with unsafe_takeover disabled. Co-developed-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-21-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Add helper to assign priority based on CPU stateJohn Ogness
Add a helper function to use the current state of the CPU to determine which priority to assign to the printing context. The EMERGENCY priority handling is added in a follow-up commit. It will use a per-CPU variable. Note: nbcon_device_try_acquire(), which is used by console drivers to acquire the nbcon console for non-printing activities, is hard-coded to always use NORMAL priority. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-20-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: Add @flags argument for console_is_usable()John Ogness
The caller of console_is_usable() usually needs @console->flags for its own checks. Rather than having console_is_usable() read its own copy, make the caller pass in the @flags. This also ensures that the caller saw the same @flags value. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-19-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: Let console_is_usable() handle nbconJohn Ogness
The nbcon consoles use a different printing callback. For nbcon consoles, check for the write_atomic() callback instead of write(). Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-18-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: Make console_is_usable() available to nbcon.cJohn Ogness
Move console_is_usable() as-is into internal.h so that it can be used by nbcon printing functions as well. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-17-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Do not rely on proxy headersJohn Ogness
The headers kernel.h, serial_core.h, and console.h allow for the definitions of many types and functions from other headers. Rather than relying on these as proxy headers, explicitly include all headers providing needed definitions. Also sort the list alphabetically to be able to easily detect duplicates. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-16-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21serial: core: Acquire nbcon context in port->lock wrapperJohn Ogness
Currently the port->lock wrappers uart_port_lock(), uart_port_unlock() (and their variants) only lock/unlock the spin_lock. If the port is an nbcon console that has implemented the write_atomic() callback, the wrappers must also acquire/release the console context and mark the region as unsafe. This allows general port->lock synchronization to be synchronized against the nbcon write_atomic() callback. Note that __uart_port_using_nbcon() relies on the port->lock being held while a console is added and removed from the console list (i.e. all uart nbcon drivers *must* take the port->lock in their device_lock() callbacks). Signed-off-by: John Ogness <john.ogness@linutronix.de> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-15-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21nbcon: Add API to acquire context for non-printing operationsJohn Ogness
Provide functions nbcon_device_try_acquire() and nbcon_device_release() which will try to acquire the nbcon console ownership with NBCON_PRIO_NORMAL and mark it unsafe for handover/takeover. These functions are to be used together with the device-specific locking when performing non-printing activities on the console device. They will allow synchronization against the atomic_write() callback which will be serialized, for higher priority contexts, only by acquiring the console context ownership. Pitfalls: The API requires to be called in a context with migration disabled because it uses per-CPU variables internally. The context is set unsafe for a takeover all the time. It guarantees full serialization against any atomic_write() caller except for the final flush in panic() which might try an unsafe takeover. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-14-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21console: Improve console_srcu_read_flags() commentsJohn Ogness
It was not clear when exactly console_srcu_read_flags() must be used vs. directly reading @console->flags. Refactor and clarify that console_srcu_read_flags() is only needed if the console is registered or the caller is in a context where the registration status of the console may change (due to another context). The function requires the caller holds @console_srcu, which will ensure that the caller sees an appropriate @flags value for the registered console and that exit/cleanup routines will not run if the console is in the process of unregistration. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-13-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21serial: core: Introduce wrapper to set @uart_port->consJohn Ogness
Introduce uart_port_set_cons() as a wrapper to set @cons of a uart_port. The wrapper sets @cons under the port lock in order to prevent @cons from disappearing while another context is holding the port lock. This is necessary for a follow-up commit relating to the port lock wrappers, which rely on @cons not changing between lock and unlock. Signed-off-by: John Ogness <john.ogness@linutronix.de> Tested-by: Théo Lebrun <theo.lebrun@bootlin.com> # EyeQ5, AMBA-PL011 Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240820063001.36405-12-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21serial: core: Provide low-level functions to lock portJohn Ogness
It will be necessary at times for the uart nbcon console drivers to acquire the port lock directly (without the additional nbcon functionality of the port lock wrappers). These are special cases such as the implementation of the device_lock()/device_unlock() callbacks or for internal port lock wrapper synchronization. Provide low-level variants __uart_port_lock_irqsave() and __uart_port_unlock_irqrestore() for this purpose. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20240820063001.36405-11-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Use driver synchronization while (un)registeringJohn Ogness
Console drivers typically have to deal with access to the hardware via user input/output (such as an interactive login shell) and output of kernel messages via printk() calls. They use some classic driver-specific locking mechanism in most situations. But console->write_atomic() callbacks, used by nbcon consoles, are synchronized only by acquiring the console context. The synchronization via the console context ownership is possible only when the console driver is registered. It is when a particular device driver is connected with a particular console driver. The two synchronization mechanisms must be synchronized between each other. It is tricky because the console context ownership is quite special. It might be taken over by a higher priority context. Also CPU migration must be disabled. The most tricky part is to (dis)connect these two mechanisms during the console (un)registration. Use the driver-specific locking callbacks: device_lock(), device_unlock(). They allow taking the device-specific lock while the device is being (un)registered by the related console driver. For example, these callbacks lock/unlock the port lock for serial port drivers. Note that the driver-specific locking is only needed during (un)register if it is an nbcon console with the write_atomic() callback implemented. If write_atomic() is not implemented, the driver should never attempt to access the hardware without first acquiring its driver-specific lock. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-10-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Add callbacks to synchronize with driverJohn Ogness
Console drivers typically must deal with access to the hardware via user input/output (such as an interactive login shell) and output of kernel messages via printk() calls. To provide the necessary synchronization, usually some driver-specific locking mechanism is used (for example, the port spinlock for uart serial consoles). Until now, usage of this driver-specific locking has been hidden from the printk-subsystem and implemented within the various console callbacks. However, nbcon consoles would need to use it even in the generic code. Add device_lock() and device_unlock() callback which will need to get implemented by nbcon consoles. The callbacks will use whatever synchronization mechanism the driver is using for itself. The minimum requirement is to prevent CPU migration. It would allow a context friendly acquiring of nbcon console ownership in non-emergency and non-panic context. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-9-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Add detailed doc for write_atomic()John Ogness
The write_atomic() callback has special requirements and is allowed to use special helper functions. Provide detailed documentation of the callback so that a developer has a chance of implementing it correctly. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-8-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Remove return value for write_atomic()John Ogness
The return value of write_atomic() does not provide any useful information. On the contrary, it makes things more complicated for the caller to appropriately deal with the information. Change write_atomic() to not have a return value. If the message did not get printed due to loss of ownership, the caller will notice this on its own. If ownership was not lost, it will be assumed that the driver successfully printed the message and the sequence number for that console will be incremented. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-7-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Clarify rules of the owner/waiter matchingJohn Ogness
The functions nbcon_owner_matches() and nbcon_waiter_matches() use a minimal set of data to determine if a context matches. The existing kerneldoc and comments were not clear enough and caused the printk folks to re-prove that the functions are indeed reliable in all cases. Update and expand the explanations so that it is clear that the implementations are sufficient for all cases. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-6-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: Check printk_deferred_enter()/_exit() usageSebastian Andrzej Siewior
Add validation that printk_deferred_enter()/_exit() are called in non-migration contexts. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-5-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: Properly deal with nbcon consoles on seq initPetr Mladek
If a non-boot console is registering and boot consoles exist, the consoles are flushed before being unregistered. This allows the non-boot console to continue where the boot console left off. If for whatever reason flushing fails, the lowest seq found from any of the enabled boot consoles is used. Until now con->seq was checked. However, if it is an nbcon boot console, the function nbcon_seq_read() must be used to read seq because con->seq is not updated for nbcon consoles. Check if it is an nbcon boot console and if so call nbcon_seq_read() to read seq. Also, avoid usage of con->seq as temporary storage of the starting record. Instead, rename console_init_seq() to get_init_console_seq() and just return the value. For nbcon consoles set the sequence via nbcon_seq_force(), for legacy consoles set con->seq. The cleaned design should make sure that the value stays and is set before the console is added to the console list. It also unifies the sequence number initialization for legacy and nbcon consoles. Reviewed-by: John Ogness <john.ogness@linutronix.de> Link: https://lore.kernel.org/r/20240820063001.36405-4-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: nbcon: Consolidate alloc() and init()John Ogness
Rather than splitting the nbcon allocation and initialization into two pieces, perform all initialization in nbcon_alloc(). Later, the initial sequence is calculated and can be explicitly set using nbcon_seq_force(). This removes the need for the strong rules of nbcon_init() that even included a BUG_ON(). Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240820063001.36405-3-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-21printk: Add notation to console_srcu lockingJohn Ogness
kernel/printk/printk.c:284:5: sparse: sparse: context imbalance in 'console_srcu_read_lock' - wrong count at exit include/linux/srcu.h:301:9: sparse: sparse: context imbalance in 'console_srcu_read_unlock' - unexpected unlock Fixes: 6c4afa79147e ("printk: Prepare for SRCU console list protection") Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Acked-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20240820063001.36405-2-john.ogness@linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-08-19Merge tag 'printk-for-6.11-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk fix from Petr Mladek: - Do not block printk on non-panic CPUs when they are dumping backtraces * tag 'printk-for-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk/panic: Allow cpu backtraces to be written into ringbuffer during panic
2024-08-18Linux 6.11-rc4v6.11-rc4Linus Torvalds
2024-08-18Merge tag 'driver-core-6.11-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are two driver fixes for regressions from 6.11-rc1 due to the driver core change making a structure in a driver core callback const. These were missed by all testing EXCEPT for what Bart happened to be running, so I appreciate the fixes provided here for some odd/not-often-used driver subsystems that nothing else happened to catch. Both of these fixes have been in linux-next all week with no reported issues" * tag 'driver-core-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: mips: sgi-ip22: Fix the build ARM: riscpc: ecard: Fix the build
2024-08-18Merge tag 'char-misc-6.11-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / misc fixes from Greg KH: "Here are some small char/misc fixes for 6.11-rc4 to resolve reported problems. Included in here are: - fastrpc revert of a change that broke userspace - xillybus fixes for reported issues Half of these have been in linux-next this week with no reported problems, I don't know if the last bit of xillybus driver changes made it in, but they are 'obviously correct' so will be safe :)" * tag 'char-misc-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: char: xillybus: Check USB endpoints when probing device char: xillybus: Refine workqueue handling Revert "misc: fastrpc: Restrict untrusted app to attach to privileged PD" char: xillybus: Don't destroy workqueue from work item running on it
2024-08-18Merge tag 'tty-6.11-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty / serial fixes from Greg KH: "Here are some small tty and serial driver fixes for 6.11-rc4 to resolve some reported problems. Included in here are: - conmakehash.c userspace build issues - fsl_lpuart driver fix - 8250_omap revert for reported regression - atmel_serial rts flag fix All of these have been in linux-next this week with no reported issues" * tag 'tty-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: Revert "serial: 8250_omap: Set the console genpd always on if no console suspend" tty: atmel_serial: use the correct RTS flag. tty: vt: conmakehash: remove non-portable code printing comment header tty: serial: fsl_lpuart: mark last busy before uart_add_one_port
2024-08-18Merge tag 'usb-6.11-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / Thunderbolt driver fixes from Greg KH: "Here are some small USB and Thunderbolt driver fixes for 6.11-rc4 to resolve some reported issues. Included in here are: - thunderbolt driver fixes for reported problems - typec driver fixes - xhci fixes - new device id for ljca usb driver All of these have been in linux-next this week with no reported issues" * tag 'usb-6.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: xhci: Fix Panther point NULL pointer deref at full-speed re-enumeration usb: misc: ljca: Add Lunar Lake ljca GPIO HID to ljca_gpio_hids[] Revert "usb: typec: tcpm: clear pd_event queue in PORT_RESET" usb: typec: ucsi: Fix the return value of ucsi_run_command() usb: xhci: fix duplicate stall handling in handle_tx_event() usb: xhci: Check for xhci->interrupters being allocated in xhci_mem_clearup() thunderbolt: Mark XDomain as unplugged when router is removed thunderbolt: Fix memory leaks in {port|retimer}_sb_regs_write()
2024-08-18Merge tag 'for-6.11-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull more btrfs fixes from David Sterba: "A more fixes. We got reports that shrinker added in 6.10 still causes latency spikes and the fixes don't handle all corner cases. Due to summer holidays we're taking a shortcut to disable it for release builds and will fix it in the near future. - only enable extent map shrinker for DEBUG builds, temporary quick fix to avoid latency spikes for regular builds - update target inode's ctime on unlink, mandated by POSIX - properly take lock to read/update block group's zoned variables - add counted_by() annotations" * tag 'for-6.11-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: only enable extent map shrinker for DEBUG builds btrfs: zoned: properly take lock to read/update block group's zoned variables btrfs: tree-checker: add dev extent item checks btrfs: update target inode's ctime on unlink btrfs: send: annotate struct name_cache_entry with __counted_by()