summaryrefslogtreecommitdiff
path: root/Documentation/trace/rv/monitor_sched.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/trace/rv/monitor_sched.rst')
-rw-r--r--Documentation/trace/rv/monitor_sched.rst307
1 files changed, 269 insertions, 38 deletions
diff --git a/Documentation/trace/rv/monitor_sched.rst b/Documentation/trace/rv/monitor_sched.rst
index 24b2c62a3bc2..3f8381ad9ec7 100644
--- a/Documentation/trace/rv/monitor_sched.rst
+++ b/Documentation/trace/rv/monitor_sched.rst
@@ -40,26 +40,6 @@ defined in by Daniel Bristot in [1].
Currently we included the following:
-Monitor tss
-~~~~~~~~~~~
-
-The task switch while scheduling (tss) monitor ensures a task switch happens
-only in scheduling context, that is inside a call to `__schedule`::
-
- |
- |
- v
- +-----------------+
- | thread | <+
- +-----------------+ |
- | |
- | schedule_entry | schedule_exit
- v |
- sched_switch |
- +--------------- |
- | sched |
- +--------------> -+
-
Monitor sco
~~~~~~~~~~~
@@ -144,26 +124,277 @@ does not enable preemption::
|
scheduling_contex -+
-Monitor sncid
-~~~~~~~~~~~~~
+Monitor sts
+~~~~~~~~~~~
-The schedule not called with interrupt disabled (sncid) monitor ensures
-schedule is not called with interrupt disabled::
+The schedule implies task switch (sts) monitor ensures a task switch happens
+only in scheduling context and up to once, as well as scheduling occurs with
+interrupts enabled but no task switch can happen before interrupts are
+disabled. When the next task picked for execution is the same as the previously
+running one, no real task switch occurs but interrupts are disabled nonetheless::
- |
- |
- v
- schedule_entry +--------------+
- schedule_exit | |
- +----------------- | can_sched |
- | | |
- +----------------> | | <+
- +--------------+ |
- | |
- | irq_disable | irq_enable
- v |
- |
- cant_sched -+
+ irq_entry |
+ +----+ |
+ v | v
+ +------------+ irq_enable #===================# irq_disable
+ | | ------------> H H irq_entry
+ | cant_sched | <------------ H H irq_enable
+ | | irq_disable H can_sched H --------------+
+ +------------+ H H |
+ H H |
+ +---------------> H H <-------------+
+ | #===================#
+ | |
+ schedule_exit | schedule_entry
+ | v
+ | +-------------------+ irq_enable
+ | | scheduling | <---------------+
+ | +-------------------+ |
+ | | |
+ | | irq_disable +--------+ irq_entry
+ | v | | --------+
+ | +-------------------+ irq_entry | in_irq | |
+ | | | -----------> | | <-------+
+ | | disable_to_switch | +--------+
+ | | | --+
+ | +-------------------+ |
+ | | |
+ | | sched_switch |
+ | v |
+ | +-------------------+ |
+ | | switching | | irq_enable
+ | +-------------------+ |
+ | | |
+ | | irq_enable |
+ | v |
+ | +-------------------+ |
+ +-- | enable_to_exit | <-+
+ +-------------------+
+ ^ | irq_disable
+ | | irq_entry
+ +---------------+ irq_enable
+
+Monitor nrp
+-----------
+
+The need resched preempts (nrp) monitor ensures preemption requires
+``need_resched``. Only kernel preemption is considered, since preemption
+while returning to userspace, for this monitor, is indistinguishable from
+``sched_switch_yield`` (described in the sssw monitor).
+A kernel preemption is whenever ``__schedule`` is called with the preemption
+flag set to true (e.g. from preempt_enable or exiting from interrupts). This
+type of preemption occurs after the need for ``rescheduling`` has been set.
+This is not valid for the *lazy* variant of the flag, which causes only
+userspace preemption.
+A ``schedule_entry_preempt`` may involve a task switch or not, in the latter
+case, a task goes through the scheduler from a preemption context but it is
+picked as the next task to run. Since the scheduler runs, this clears the need
+to reschedule. The ``any_thread_running`` state does not imply the monitored
+task is not running as this monitor does not track the outcome of scheduling.
+
+In theory, a preemption can only occur after the ``need_resched`` flag is set. In
+practice, however, it is possible to see a preemption where the flag is not
+set. This can happen in one specific condition::
+
+ need_resched
+ preempt_schedule()
+ preempt_schedule_irq()
+ __schedule()
+ !need_resched
+ __schedule()
+
+In the situation above, standard preemption starts (e.g. from preempt_enable
+when the flag is set), an interrupt occurs before scheduling and, on its exit
+path, it schedules, which clears the ``need_resched`` flag.
+When the preempted task runs again, the standard preemption started earlier
+resumes, although the flag is no longer set. The monitor considers this a
+``nested_preemption``, this allows another preemption without re-setting the
+flag. This condition relaxes the monitor constraints and may catch false
+negatives (i.e. no real ``nested_preemptions``) but makes the monitor more
+robust and able to validate other scenarios.
+For simplicity, the monitor starts in ``preempt_irq``, although no interrupt
+occurred, as the situation above is hard to pinpoint::
+
+ schedule_entry
+ irq_entry #===========================================#
+ +-------------------------- H H
+ | H H
+ +-------------------------> H any_thread_running H
+ H H
+ +-------------------------> H H
+ | #===========================================#
+ | schedule_entry | ^
+ | schedule_entry_preempt | sched_need_resched | schedule_entry
+ | | schedule_entry_preempt
+ | v |
+ | +----------------------+ |
+ | +--- | | |
+ | sched_need_resched | | rescheduling | -+
+ | +--> | |
+ | +----------------------+
+ | | irq_entry
+ | v
+ | +----------------------+
+ | | | ---+
+ | ---> | | | sched_need_resched
+ | | preempt_irq | | irq_entry
+ | | | <--+
+ | | | <--+
+ | +----------------------+ |
+ | | schedule_entry | sched_need_resched
+ | | schedule_entry_preempt |
+ | v |
+ | +-----------------------+ |
+ +-------------------------- | nested_preempt | --+
+ +-----------------------+
+ ^ irq_entry |
+ +-------------------+
+
+Due to how the ``need_resched`` flag on the preemption count works on arm64,
+this monitor is unstable on that architecture, as it often records preemption
+when the flag is not set, even in presence of the workaround above.
+For the time being, the monitor is disabled by default on arm64.
+
+Monitor sssw
+------------
+
+The set state sleep and wakeup (sssw) monitor ensures ``set_state`` to
+sleepable leads to sleeping and sleeping tasks require wakeup. It includes the
+following types of switch:
+
+* ``switch_suspend``:
+ a task puts itself to sleep, this can happen only after explicitly setting
+ the task to ``sleepable``. After a task is suspended, it needs to be woken up
+ (``waking`` state) before being switched in again.
+ Setting the task's state to ``sleepable`` can be reverted before switching if it
+ is woken up or set to ``runnable``.
+* ``switch_blocking``:
+ a special case of a ``switch_suspend`` where the task is waiting on a
+ sleeping RT lock (``PREEMPT_RT`` only), it is common to see wakeup and set
+ state events racing with each other and this leads the model to perceive this
+ type of switch when the task is not set to sleepable. This is a limitation of
+ the model in SMP system and workarounds may slow down the system.
+* ``switch_preempt``:
+ a task switch as a result of kernel preemption (``schedule_entry_preempt`` in
+ the nrp model).
+* ``switch_yield``:
+ a task explicitly calls the scheduler or is preempted while returning to
+ userspace. It can happen after a ``yield`` system call, from the idle task or
+ if the ``need_resched`` flag is set. By definition, a task cannot yield while
+ ``sleepable`` as that would be a suspension. A special case of a yield occurs
+ when a task in ``TASK_INTERRUPTIBLE`` calls the scheduler while a signal is
+ pending. The task doesn't go through the usual blocking/waking and is set
+ back to runnable, the resulting switch (if there) looks like a yield to the
+ ``signal_wakeup`` state and is followed by the signal delivery. From this
+ state, the monitor expects a signal even if it sees a wakeup event, although
+ not necessary, to rule out false negatives.
+
+This monitor doesn't include a running state, ``sleepable`` and ``runnable``
+are only referring to the task's desired state, which could be scheduled out
+(e.g. due to preemption). However, it does include the event
+``sched_switch_in`` to represent when a task is allowed to become running. This
+can be triggered also by preemption, but cannot occur after the task got to
+``sleeping`` before a ``wakeup`` occurs::
+
+ +--------------------------------------------------------------------------+
+ | |
+ | |
+ | switch_suspend | |
+ | switch_blocking | |
+ v v |
+ +----------+ #==========================# set_state_runnable |
+ | | H H wakeup |
+ | | H H switch_in |
+ | | H H switch_yield |
+ | sleeping | H H switch_preempt |
+ | | H H signal_deliver |
+ | | switch_ H H ------+ |
+ | | _blocking H runnable H | |
+ | | <----------- H H <-----+ |
+ +----------+ H H |
+ | wakeup H H |
+ +---------------------> H H |
+ H H |
+ +---------> H H |
+ | #==========================# |
+ | | ^ |
+ | | | set_state_runnable |
+ | | | wakeup |
+ | set_state_sleepable | +------------------------+
+ | v | |
+ | +--------------------------+ set_state_sleepable
+ | | | switch_in
+ | | | switch_preempt
+ signal_deliver | sleepable | signal_deliver
+ | | | ------+
+ | | | |
+ | | | <-----+
+ | +--------------------------+
+ | | ^
+ | switch_yield | set_state_sleepable
+ | v |
+ | +---------------+ |
+ +---------- | signal_wakeup | -+
+ +---------------+
+ ^ | switch_in
+ | | switch_preempt
+ | | switch_yield
+ +-----------+ wakeup
+
+Monitor opid
+------------
+
+The operations with preemption and irq disabled (opid) monitor ensures
+operations like ``wakeup`` and ``need_resched`` occur with interrupts and
+preemption disabled or during interrupt context, in such case preemption may
+not be disabled explicitly.
+``need_resched`` can be set by some RCU internals functions, in which case it
+doesn't match a task wakeup and might occur with only interrupts disabled::
+
+ | sched_need_resched
+ | sched_waking
+ | irq_entry
+ | +--------------------+
+ v v |
+ +------------------------------------------------------+
+ +----------- | disabled | <+
+ | +------------------------------------------------------+ |
+ | | ^ |
+ | | preempt_disable sched_need_resched |
+ | preempt_enable | +--------------------+ |
+ | v | v | |
+ | +------------------------------------------------------+ |
+ | | irq_disabled | |
+ | +------------------------------------------------------+ |
+ | | | ^ |
+ | irq_entry irq_entry | | |
+ | sched_need_resched v | irq_disable |
+ | sched_waking +--------------+ | | |
+ | +----- | | irq_enable | |
+ | | | in_irq | | | |
+ | +----> | | | | |
+ | +--------------+ | | irq_disable
+ | | | | |
+ | irq_enable | irq_enable | | |
+ | v v | |
+ | #======================================================# |
+ | H enabled H |
+ | #======================================================# |
+ | | ^ ^ preempt_enable | |
+ | preempt_disable preempt_enable +--------------------+ |
+ | v | |
+ | +------------------+ | |
+ +----------> | preempt_disabled | -+ |
+ +------------------+ |
+ | |
+ +-------------------------------------------------------+
+
+This monitor is designed to work on ``PREEMPT_RT`` kernels, the special case of
+events occurring in interrupt context is a shortcut to identify valid scenarios
+where the preemption tracepoints might not be visible, during interrupts
+preemption is always disabled. On non- ``PREEMPT_RT`` kernels, the interrupts
+might invoke a softirq to set ``need_resched`` and wake up a task. This is
+another special case that is currently not supported by the monitor.
References
----------