diff options
Diffstat (limited to 'Documentation/trace/rv/monitor_synthesis.rst')
-rw-r--r-- | Documentation/trace/rv/monitor_synthesis.rst | 271 |
1 files changed, 271 insertions, 0 deletions
diff --git a/Documentation/trace/rv/monitor_synthesis.rst b/Documentation/trace/rv/monitor_synthesis.rst new file mode 100644 index 000000000000..ac808a7554f5 --- /dev/null +++ b/Documentation/trace/rv/monitor_synthesis.rst @@ -0,0 +1,271 @@ +Runtime Verification Monitor Synthesis +====================================== + +The starting point for the application of runtime verification (RV) techniques +is the *specification* or *modeling* of the desired (or undesired) behavior +of the system under scrutiny. + +The formal representation needs to be then *synthesized* into a *monitor* +that can then be used in the analysis of the trace of the system. The +*monitor* connects to the system via an *instrumentation* that converts +the events from the *system* to the events of the *specification*. + + +In Linux terms, the runtime verification monitors are encapsulated inside +the *RV monitor* abstraction. The RV monitor includes a set of instances +of the monitor (per-cpu monitor, per-task monitor, and so on), the helper +functions that glue the monitor to the system reference model, and the +trace output as a reaction to event parsing and exceptions, as depicted +below:: + + Linux +----- RV Monitor ----------------------------------+ Formal + Realm | | Realm + +-------------------+ +----------------+ +-----------------+ + | Linux kernel | | Monitor | | Reference | + | Tracing | -> | Instance(s) | <- | Model | + | (instrumentation) | | (verification) | | (specification) | + +-------------------+ +----------------+ +-----------------+ + | | | + | V | + | +----------+ | + | | Reaction | | + | +--+--+--+-+ | + | | | | | + | | | +-> trace output ? | + +------------------------|--|----------------------+ + | +----> panic ? + +-------> <user-specified> + +RV monitor synthesis +-------------------- + +The synthesis of a specification into the Linux *RV monitor* abstraction is +automated by the rvgen tool and the header file containing common code for +creating monitors. The header files are: + + * rv/da_monitor.h for deterministic automaton monitor. + * rv/ltl_monitor.h for linear temporal logic monitor. + +rvgen +----- + +The rvgen utility converts a specification into the C presentation and creating +the skeleton of a kernel monitor in C. + +For example, it is possible to transform the wip.dot model present in +[1] into a per-cpu monitor with the following command:: + + $ rvgen monitor -c da -s wip.dot -t per_cpu + +This will create a directory named wip/ with the following files: + +- wip.h: the wip model in C +- wip.c: the RV monitor + +The wip.c file contains the monitor declaration and the starting point for +the system instrumentation. + +Similarly, a linear temporal logic monitor can be generated with the following +command:: + + $ rvgen monitor -c ltl -s pagefault.ltl -t per_task + +This generates pagefault/ directory with: + +- pagefault.h: The Buchi automaton (the non-deterministic state machine to + verify the specification) +- pagefault.c: The skeleton for the RV monitor + +Monitor header files +-------------------- + +The header files: + +- `rv/da_monitor.h` for deterministic automaton monitor +- `rv/ltl_monitor` for linear temporal logic monitor + +include common macros and static functions for implementing *Monitor +Instance(s)*. + +The benefits of having all common functionalities in a single header file are +3-fold: + + - Reduce the code duplication; + - Facilitate the bug fix/improvement; + - Avoid the case of developers changing the core of the monitor code to + manipulate the model in a (let's say) non-standard way. + +rv/da_monitor.h ++++++++++++++++ + +This initial implementation presents three different types of monitor instances: + +- ``#define DECLARE_DA_MON_GLOBAL(name, type)`` +- ``#define DECLARE_DA_MON_PER_CPU(name, type)`` +- ``#define DECLARE_DA_MON_PER_TASK(name, type)`` + +The first declares the functions for a global deterministic automata monitor, +the second for monitors with per-cpu instances, and the third with per-task +instances. + +In all cases, the 'name' argument is a string that identifies the monitor, and +the 'type' argument is the data type used by rvgen on the representation of +the model in C. + +For example, the wip model with two states and three events can be +stored in an 'unsigned char' type. Considering that the preemption control +is a per-cpu behavior, the monitor declaration in the 'wip.c' file is:: + + DECLARE_DA_MON_PER_CPU(wip, unsigned char); + +The monitor is executed by sending events to be processed via the functions +presented below:: + + da_handle_event_$(MONITOR_NAME)($(event from event enum)); + da_handle_start_event_$(MONITOR_NAME)($(event from event enum)); + da_handle_start_run_event_$(MONITOR_NAME)($(event from event enum)); + +The function ``da_handle_event_$(MONITOR_NAME)()`` is the regular case where +the event will be processed if the monitor is processing events. + +When a monitor is enabled, it is placed in the initial state of the automata. +However, the monitor does not know if the system is in the *initial state*. + +The ``da_handle_start_event_$(MONITOR_NAME)()`` function is used to notify the +monitor that the system is returning to the initial state, so the monitor can +start monitoring the next event. + +The ``da_handle_start_run_event_$(MONITOR_NAME)()`` function is used to notify +the monitor that the system is known to be in the initial state, so the +monitor can start monitoring and monitor the current event. + +Using the wip model as example, the events "preempt_disable" and +"sched_waking" should be sent to monitor, respectively, via [2]:: + + da_handle_event_wip(preempt_disable_wip); + da_handle_event_wip(sched_waking_wip); + +While the event "preempt_enabled" will use:: + + da_handle_start_event_wip(preempt_enable_wip); + +To notify the monitor that the system will be returning to the initial state, +so the system and the monitor should be in sync. + +rv/ltl_monitor.h +++++++++++++++++ +This file must be combined with the $(MODEL_NAME).h file (generated by `rvgen`) +to be complete. For example, for the `pagefault` monitor, the `pagefault.c` +source file must include:: + + #include "pagefault.h" + #include <rv/ltl_monitor.h> + +(the skeleton monitor file generated by `rvgen` already does this). + +`$(MODEL_NAME).h` (`pagefault.h` in the above example) includes the +implementation of the Buchi automaton - a non-deterministic state machine that +verifies the LTL specification. While `rv/ltl_monitor.h` includes the common +helper functions to interact with the Buchi automaton and to implement an RV +monitor. An important definition in `$(MODEL_NAME).h` is:: + + enum ltl_atom { + LTL_$(FIRST_ATOMIC_PROPOSITION), + LTL_$(SECOND_ATOMIC_PROPOSITION), + ... + LTL_NUM_ATOM + }; + +which is the list of atomic propositions present in the LTL specification +(prefixed with "LTL\_" to avoid name collision). This `enum` is passed to the +functions interacting with the Buchi automaton. + +While generating code, `rvgen` cannot understand the meaning of the atomic +propositions. Thus, that task is left for manual work. The recommended pratice +is adding tracepoints to places where the atomic propositions change; and in the +tracepoints' handlers: the Buchi automaton is executed using:: + + void ltl_atom_update(struct task_struct *task, enum ltl_atom atom, bool value) + +which tells the Buchi automaton that the atomic proposition `atom` is now +`value`. The Buchi automaton checks whether the LTL specification is still +satisfied, and invokes the monitor's error tracepoint and the reactor if +violation is detected. + +Tracepoints and `ltl_atom_update()` should be used whenever possible. However, +it is sometimes not the most convenient. For some atomic propositions which are +changed in multiple places in the kernel, it is cumbersome to trace all those +places. Furthermore, it may not be important that the atomic propositions are +updated at precise times. For example, considering the following linear temporal +logic:: + + RULE = always (RT imply not PAGEFAULT) + +This LTL states that a real-time task does not raise page faults. For this +specification, it is not important when `RT` changes, as long as it has the +correct value when `PAGEFAULT` is true. Motivated by this case, another +function is introduced:: + + void ltl_atom_fetch(struct task_struct *task, struct ltl_monitor *mon) + +This function is called whenever the Buchi automaton is triggered. Therefore, it +can be manually implemented to "fetch" `RT`:: + + void ltl_atom_fetch(struct task_struct *task, struct ltl_monitor *mon) + { + ltl_atom_set(mon, LTL_RT, rt_task(task)); + } + +Effectively, whenever `PAGEFAULT` is updated with a call to `ltl_atom_update()`, +`RT` is also fetched. Thus, the LTL specification can be verified without +tracing `RT` everywhere. + +For atomic propositions which act like events, they usually need to be set (or +cleared) and then immediately cleared (or set). A convenient function is +provided:: + + void ltl_atom_pulse(struct task_struct *task, enum ltl_atom atom, bool value) + +which is equivalent to:: + + ltl_atom_update(task, atom, value); + ltl_atom_update(task, atom, !value); + +To initialize the atomic propositions, the following function must be +implemented:: + + ltl_atoms_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation) + +This function is called for all running tasks when the monitor is enabled. It is +also called for new tasks created after the enabling the monitor. It should +initialize as many atomic propositions as possible, for example:: + + void ltl_atom_init(struct task_struct *task, struct ltl_monitor *mon, bool task_creation) + { + ltl_atom_set(mon, LTL_RT, rt_task(task)); + if (task_creation) + ltl_atom_set(mon, LTL_PAGEFAULT, false); + } + +Atomic propositions not initialized by `ltl_atom_init()` will stay in the +unknown state until relevant tracepoints are hit, which can take some time. As +monitoring for a task cannot be done until all atomic propositions is known for +the task, the monitor may need some time to start validating tasks which have +been running before the monitor is enabled. Therefore, it is recommended to +start the tasks of interest after enabling the monitor. + +Final remarks +------------- + +With the monitor synthesis in place using the header files and +rvgen, the developer's work should be limited to the instrumentation +of the system, increasing the confidence in the overall approach. + +[1] For details about deterministic automata format and the translation +from one representation to another, see:: + + Documentation/trace/rv/deterministic_automata.rst + +[2] rvgen appends the monitor's name suffix to the events enums to +avoid conflicting variables when exporting the global vmlinux.h +use by BPF programs. |